CRITERIA FOR DETERMINING THE 
READABILITY OF TYPE FACES 


MILES A. TINKER 


University of Minnesota 


Until recently the term ‘legibility’ found wide usage in dis- 
cussing hygienic printing. “Legibility of print” was a favorite 
phrase. It became increasingly evident, however, that the 
meaning of legibility was anything but clear. Quickness of 
perception, speed of reading, perceptibility, and other criteria of 
legibility have been employed.* Because of this unsatisfactory 
condition, a less equivocal concept has been sought. 

In recent years writers have come more and more to talk about 
readability of print. Paterson and Tinker‘ and Luckiesh and 
Moss! have been the leaders in developing the concept of reada- 
bility. Apparently there is a marked disagreement, however, 
as to what constitutes readability, or rather what factors are 
most important in determining readability. Paterson and 
Tinker,‘ in ‘‘ How to Make Type Readable,” have used the words 
legibility and readability interchangeably to signify ease and 
speed of reading printed material under normal reading con- 
ditions. In addition they have recognized the problem of visi- 
bility and perceptibility of printed material. Therefore, although 
emphasizing speed of reading as a criterion of readability, they 
are aware that other criteria may be employed if adequate 
validity is established. Luckiesh and Moss! reject speed of 
reading as a criterion of readability on the basis of inadequate 
data and emphasize mainly rate of involuntary blinking, and 
visibility as criteria. 

It would seem that the term readability of print is tending to 
become as equivocal in its meaning as legibility of print. Obvi- 
ously it is desirable to examine the validity of various so-called 
criteria of readability. 
A first step in such a program is to discover how well certain 
criteria of readability correspond with each other, employing 
exactly the same kind of printed material to be read, perceived or 
discriminated by the subject. This investigation attempts to 
do that. The purpose is to compare the visibility, the per- 
ceptibility at a distance, and the speed of reading of materials 
printed in ten type faces. 

The data on speed of reading and on perceptibility at a distance 
for the type faces have been published. These data will be 
compared with the results obtained in this experiment in which 
visibility of text for the same ten type faces was measured. 

The ten book type faces employed in this study were: Scotch 
Roman, Garamont, Antique, Bodoni, Old Style, Caslon Old 
Style, Cheltenham, Kabel Light, American Typewriter, Cloister 
Black (Old English). They were printed in ten-point type, set 
solid in nineteen pica line width on enamel paper stock. Illus- 
trations of the type faces are shown in Paterson and Tinker.® 

In the visibility study, thirty five-letter words were cut from 
the text for each type face and each word mounted at the center 
of a four-by-six-inch white index card. 

The experiment was carried out in a light laboratory with 
ten foot-candles of general illumination. Visibility measure- 
ments were made with the Luckiesh-Moss Visibility Meter.’ 
This instrument yields precise scale values at the place where the 
subject is able to first discriminate the test object (read the word 
on the card). The visibility meter was mounted at a constant 
distance of fifteen inches from the test card which was held on a 
slanting reading stand. The subject slowly rotated the filters 
in the apparatus until the word could be apprehended. The 
scale value corresponding to the reading was recorded, the filter 
set back to opaqueness and the next stimulus card placed on the 
stand. Observations were continued until all thirty words had 
been read. 

In all, thirty-six university students served as subjects in the 
experiment, four in each of the nine subgroups listed below. All 
had normal or adequately corrected vision. Each subject read 
Scotch Roman and one other type face. The thirty words in 
Scotch Roman served as a standard in each test group. For 
the variation of type face, a second series of thirty words was 
employed. The words in the two series are of practically equal 
familiarity according to Thorndike’s Teacher’s Word Book. 
Four subjects read the Scotch Roman (standard) and one of the 
other type faces. Since there were nine type faces besides the 
Scotch Roman, there were nine experimental subgroups of four 
subjects each. In presenting the stimulus words, the Scotch 
Roman and comparison type face were systematically varied to 
control practice and fatigue effects. 


RESULTS AND DISCUSSION 


Basic data of the study are given in Table I. For each 
subject, the mean visibility score is an average of thirty readings. 
The mean for all four subjects is derived from one hundred 
twenty readings. Examination of this table reveals marked 
individual differences within the subgroups as well as in the total 
group (for Scotch Roman). The most striking thing revealed 
in Table I is the stability of the trends between Scotch Roman 
and the other type face from subject to subject within any 
subgroup. There is not a single reversal of trend anywhere in 
the table. As a matter of fact, the results of one subject in each 
subgroup would yield about the same trend as is found in the 
average for four subjects. One may conclude that the visibility 
meter yields highly stable scores. 
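
The claim that no subject reverses the trend can be checked directly from the per-subject means in Table I. The following is a minimal sketch, assuming the scores are transcribed correctly from that table; the names `TABLE_I` and `count_reversals` are illustrative and do not come from the original study.

```python
# Per-subject mean visibility scores transcribed from Table I.
# Each entry: (comparison face, Scotch Roman scores, comparison-face scores),
# with the four subjects of a subgroup listed in the same order in both lists.
TABLE_I = [
    ("Antique",             [1.74, 2.28, 2.90, 2.75], [2.76, 3.51, 4.43, 4.40]),
    ("Cheltenham",          [2.29, 2.59, 2.92, 3.38], [2.99, 3.66, 4.30, 4.55]),
    ("American Typewriter", [2.28, 2.66, 3.34, 2.73], [3.53, 3.46, 4.24, 3.81]),
    ("Cloister Black",      [2.71, 3.46, 3.42, 2.98], [3.28, 4.07, 4.55, 3.81]),
    ("Bodoni",              [3.65, 2.61, 3.35, 2.77], [4.25, 3.19, 3.71, 3.65]),
    ("Garamont",            [3.40, 2.36, 2.56, 2.97], [3.67, 2.92, 3.24, 3.37]),
    ("Old Style",           [2.84, 3.91, 2.56, 2.23], [3.23, 4.34, 3.33, 2.46]),
    ("Caslon Old Style",    [2.27, 2.57, 2.19, 2.32], [2.54, 2.77, 2.39, 2.70]),
    ("Kabel Light",         [2.72, 2.70, 2.44, 2.94], [2.88, 2.91, 2.79, 3.02]),
]


def count_reversals(table):
    """Count subjects whose comparison-face score fails to exceed
    their own Scotch Roman score (a 'reversal of trend')."""
    return sum(
        1
        for _, standard, comparison in table
        for s, c in zip(standard, comparison)
        if c <= s
    )


def subgroup_means(table):
    """Return {face: (Scotch Roman mean, comparison-face mean)} per subgroup."""
    return {
        face: (sum(std) / len(std), sum(comp) / len(comp))
        for face, std, comp in table
    }


reversals = count_reversals(TABLE_I)
means = subgroup_means(TABLE_I)
print("reversals:", reversals)
```

With the values as transcribed, every one of the thirty-six subjects scores higher on the comparison face than on Scotch Roman, so the count of reversals is zero.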

The data were further analyzed and the results are presented 
in Table II. In column 3 are the mean visibility scores, each 
based upon one hundred twenty readings. The differences in 
visibility between the standard (Scotch Roman) and each of the 
other type faces are given in column 5; the percentage differences 
in column 6. In column 7 are given the critical ratios. These 
show that all differences are statistically significant. The 
visibility ranks, in terms of difference between the standard and each 
of the other type faces, are given in column 2. The visibility of Antique 
is greatest, and of Scotch Roman, least. The separation between 
ranks is not great in some instances. Thus, in ranks 2 and 3, 
Cheltenham and American Typewriter have, respectively, 38.6 
and 36.5 per cent greater visibility than the standard. A similar 
situation occurs in ranks 5 and 6 for Bodoni and Garamont, 
and in ranks 8 and 9 for Caslon Old Style and Kabel Light. 
Otherwise the percentages are well separated in value. Those 
type faces in the high ranks tend in appearance toward a bold 
face type, and those in low ranks toward a light face. Luckiesh 
TABLE I.—MEAN VISIBILITY SCORE FOR EACH SUBJECT IN EACH 
SUBGROUP 

                                 Subject  Subject  Subject  Subject    All 4
Group   Type Face                 No. 1    No. 2    No. 3    No. 4   Subjects*

I       Scotch Roman               1.74     2.28     2.90     2.75     2.41
        Antique                    2.76     3.51     4.43     4.40     3.77

II      Scotch Roman               2.29     2.59     2.92     3.38     2.80
        Cheltenham                 2.99     3.66     4.30     4.55     3.87

III     Scotch Roman               2.28     2.66     3.34     2.73     2.75
        American Typewriter        3.53     3.46     4.24     3.81     3.76

IV      Scotch Roman               2.71     3.46     3.42     2.98     3.14
        Cloister Black             3.28     4.07     4.55     3.81     3.93

V       Scotch Roman               3.65     2.61     3.35     2.77     3.10
        Bodoni                     4.25     3.19     3.71     3.65     3.70

VI      Scotch Roman               3.40     2.36     2.56     2.97     2.82
        Garamont                   3.67     2.92     3.24     3.37     3.30

VII     Scotch Roman               2.84     3.91     2.56     2.23     2.89
        Old Style                  3.23     4.34     3.33     2.46     3.34

VIII    Scotch Roman               2.27     2.57     2.19     2.32     2.34
        Caslon Old Style           2.54     2.77     2.39     2.70     2.60

IX      Scotch Roman               2.72     2.70     2.44     2.94     2.68
        Kabel Light                2.88     2.91     2.79     3.02     2.90

* Grand mean of all Scotch Roman scores is 2.77. 


and Moss! have found boldness to be an important determinant 
of visibility. Apparently other factors than degree of boldness 
help to determine visibility. Familiarity or lack of familiarity 
with the appearance seems to have an effect. Thus Cloister 
Black (Old English) with a decided tendency toward boldness is 
only in fourth place and Kabel Light, an Ultra Modern face, is in 
ninth place. Another factor is size. It is well known that, even 
though different type faces are printed in the same type size (as ten 
TABLE II.—THE EFFECT OF VARIATIONS IN TYPE FACE ON VISI- 
BILITY OF PRINT 

                                     Mean                Difference
                       Visibility  Visibility          Betw. Means in
Type Face                 Rank       Score      σM     Score  Per Cent   D/σ Diff
(1)                       (2)         (3)       (4)     (5)      (6)       (7)

Scotch Roman                          2.41      .05
Antique                     1         3.77      .08    +1.36    56.3      14.3

Scotch Roman                          2.80      .06
Cheltenham                  2         3.87      .07    +1.07    38.6      12.5

Scotch Roman                          2.75      .05
American Typewriter         3         3.76      .05    +1.01    36.5      14.2

Scotch Roman                          3.14      .05
Cloister Black              4         3.93      .08    +0.79    25.0       8.7

Scotch Roman                          3.10      .05
Bodoni                      5         3.70      .06    +0.60    19.5       7.7

Scotch Roman                          2.82      .06
Garamont                    6         3.30      .05    +0.48    16.9       6.4

Scotch Roman                          2.89      .07
Old Style                   7         3.34      .07    +0.45    15.8       4.5

Scotch Roman                          2.34      .04
Caslon Old Style            8         2.60      .04    +0.26    11.1       5.0

Scotch Roman                          2.68      .05
Kabel Light                 9         2.90      .04    +0.22     8.2       3.6

Scotch Roman                          2.77*     .02
Scotch Roman               10         2.77      .02     0.00     0.0       0.0

* This mean is the grand mean for all Scotch Roman scores in first nine 
groups of table. 
point), there are variations in the actual size of a given letter from 
face to face. Thus, Scotch Roman has relatively small letters. 
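
The critical ratios in column 7 of Table II can be approximated from the other columns. The sketch below rests on two assumptions that the text does not state explicitly: that column 4 is the standard error of each mean, and that the critical ratio is the difference between means divided by the square root of the sum of the squared standard errors. The function name `critical_ratio` is illustrative.

```python
import math


def critical_ratio(mean_standard, se_standard, mean_face, se_face):
    """Difference between two means divided by the standard error of
    that difference, taken here (an assumption) as sqrt(se1^2 + se2^2)."""
    difference = mean_face - mean_standard
    se_difference = math.sqrt(se_standard ** 2 + se_face ** 2)
    return difference / se_difference


# Subgroup III of Table II: Scotch Roman 2.75 (.05) vs. American Typewriter 3.76 (.05).
cr_typewriter = critical_ratio(2.75, 0.05, 3.76, 0.05)
print(round(cr_typewriter, 1))
```

For this pair the sketch gives about 14.3, close to the printed 14.2. Small disagreements of this size are to be expected, since the printed means and standard errors are rounded to two places.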

The material in Table II supports the conclusion that there are 
important differences in the visibility of type faces commonly 
used as well as for those less frequently used (as Kabel Light and 
Cloister Black vs Scotch Roman). How do these variations 
compare with variations in speed of reading, perceptibility at a 
distance, and reader opinion of legibility or readability? 

In Table III is given a comparison of visibility measures obtained 
in this experiment with data derived from other studies on the 
same type faces. The per cent differences in columns 2, 4, and 
6 are differences between type face listed and the standard 
(Scotch Roman). The perceptibility at a distance scores, 
columns 4 and 5, were obtained by noting the distance from the 
eyes at which printed words could be perceived accurately and 
then computing the per cent difference between scores for the 
various type faces and the standard. The data in columns 6 
and 7 were derived in a similar manner from speed of reading 
scores. In columns 8 and 9, the data were obtained by presenting 
two hundred ten readers with samples of type and asking the 
readers to rank them according to apparent legibility. 

As stated above, the visibility scores for the various type faces 
differed significantly from the Scotch Roman standard. For 
perceptibility at a distance, all differences were significant except 
for Cloister Black, Bodoni, and Kabel Light. In speed of reading, 
however, only American Typewriter and Cloister Black differed 
significantly from the standard. 

In discussing the rankings derived from these scores, some 
reservations should be kept in mind. For example, the differences 
between certain scores are so small that the corresponding ranks 
are not actually different although they are listed as such. Thus 
the following ranks probably are not reliably different from one 
another: for visibility, ranks 2 and 3, 6 and 7; for perceptibility, 
ranks 5 and 6, 6 and 7, 8 and 9, 9 and 10; for speed of reading, 
ranks 1, 2, 3, 4, 5, 6, 7 and 8; for reader opinion, ranks 1 and 2, 
3 and 4, 7 and 8. 

Noting these cautions, we may examine the correspondence in 
the sets of data in Table III. Visibility appears to have much in 
common with perceptibility at a distance. The greatest dis- 
crepancy in rank appears for Cloister Black which ranks fourth 
TABLE III.—COMPARISON OF VISIBILITY, PERCEPTIBILITY AT DIS- 
TANCE, SPEED OF READING, AND READER OPINIONS OF LEGI- 
BILITY FOR TEN TYPE FACES* 

                                                                            Reader Opinions of
Type Face Com-            Visibility      Perceptibility   Speed of Reading  Relative Legibility
parison: Scotch          Per Cent          Per Cent          Per Cent         Mean
Roman Versus            Difference  Rank  Difference  Rank  Difference  Rank  Rank    Rank
(1)                        (2)      (3)      (4)      (5)      (6)      (7)   (8)     (9)

Antique                   +56.3      1      -14.8      3       -0.2      3    2.4      2
Cheltenham                +38.6      2      -22.2      2       -2.5      8    2.3      1
American Typewriter       +36.5      3      -37.7      1       -5.1      9    5.5      6
Cloister Black            +25.0      4       +2.3     10      -16.5     10    9.8     10
Bodoni                    +19.5      5       -4.6      7       -1.1      4.5  4.2      3
Garamont                  +16.9      6       -6.8      6       +0.5      1    5.4      5
Old Style                 +15.8      7      -11.4      4       -1.1      4.5  4.6      4
Caslon Old Style          +11.1      8       -7.9      5       -1.3      6    6.4      8
Kabel Light                +8.2      9       +0.1      9       -2.3      7    8.2      9
Scotch Roman                0.0     10        0.0      8        0.0      2    6.2      7

* Perceptibility data adapted from Webster and Tinker;® speed of reading 
data from Paterson and Tinker; and opinions of relative legibility from 
Paterson and Tinker.‘ 

All differences for the visibility data are statistically significant; for per- 
ceptibility data, differences of six per cent and greater are significant; and 
for speed of reading data, differences of five per cent and greater are signifi- 
cant. For perceptibility, minus differences mean better perceptibility. 


in visibility but tenth in perceptibility. The correlation between 
ranks yields a rho of +.58. In comparing visibility with speed 
of reading, the most valid comparisons are for ranks of Scotch 
Roman, Cloister Black and American Typewriter since the latter 
two were the only type faces read reliably slower than the 
standard (Scotch Roman). Here marked discrepancies in rank 
occur. The correlation between visibility and speed of reading is 
—.30. Thus there is a slight tendency for the more visible faces 
to be read slower. A similar trend is found in comparing 
perceptibility and speed of reading. Marked differences appear 
in the respective ranks and the correlation is —.33. Note that 
Scotch Roman, which has very low visibility and perceptibility 
scores, is read as fast as all of the other commonly used type faces. 

Reader opinion of legibility (judged in terms of ease and com- 
fort for reading) shows interesting relations to the other measures. 
The ranks for judged legibility correspond more closely to 
visibility and perceptibility ranks than to speed of reading ranks. 
The correlations follow: 


Visibility vs. judged legibility............. +.58 
Perceptibility vs. judged legibility......... +.67 
Speed of reading vs. judged legibility....... +.33 


Thus we find that readers judge type faces which are most readily 
perceived at a distance to be best for comfortable reading. This 
is true to a lesser degree for the visibility scores. It will be noted, 
however, that Cloister Black which is read slowest of all, is also 
considered hardest to read. Also note that American Type- 
writer, which is read significantly slower than commonly used 
type faces but which has a high rating for visibility and per- 
ceptibility, is considered relatively difficult to read. Tinker and 
Paterson® have shown that aesthetic preferences tend to coincide 
closely with reader opinion of relative legibility. For practical 
purposes, therefore, reader preference and reader judgment of 
legibility or readability mean the same. Thus in our study we 
find readers preferring as well as considering best for reading 
those type faces which tend toward boldness as Antique and 
Cheltenham. But there are exceptions. Although Cloister 
Black and American Typewriter are highly visible, they are 
neither liked nor considered readable. Incidentally, some unpub- 
lished work by Paterson and Tinker shows that readers prefer a 
medium sized type (ten and eleven point) rather than larger 
sizes. Here again reader opinion agrees with speed of reading 
results rather than with visibility. In general we find, therefore, 
that of the type faces read equally fast, readers prefer those that 
are more perceptible and of high visibility, but the readers do not 
prefer type faces which retard speed of reading even though these 
type faces have high visibility. 
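
The rank correlations involving judged legibility, and the rho between visibility and perceptibility, can be recomputed from the ranks in Table III. The sketch below assumes the simple rank-difference formula, rho = 1 − 6Σd² / (n(n² − 1)), with no correction for the tied speed-of-reading ranks; the original does not say which formula was used, and the names `RANKS` and `spearman_rho` are illustrative.

```python
# Ranks transcribed from Table III, in the row order of that table:
# Antique, Cheltenham, American Typewriter, Cloister Black, Bodoni,
# Garamont, Old Style, Caslon Old Style, Kabel Light, Scotch Roman.
RANKS = {
    "visibility":     [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    "perceptibility": [3, 2, 1, 10, 7, 6, 4, 5, 9, 8],
    "speed":          [3, 8, 9, 10, 4.5, 1, 4.5, 6, 7, 2],
    "judged":         [2, 1, 6, 10, 3, 5, 4, 8, 9, 7],
}


def spearman_rho(ranks_x, ranks_y):
    """Rank-difference correlation: 1 - 6 * sum(d^2) / (n * (n^2 - 1)).
    Assumes both inputs are already ranks; ties are not corrected for."""
    n = len(ranks_x)
    sum_d_squared = sum((x - y) ** 2 for x, y in zip(ranks_x, ranks_y))
    return 1 - 6 * sum_d_squared / (n * (n ** 2 - 1))


rho_vis_perc = spearman_rho(RANKS["visibility"], RANKS["perceptibility"])
rho_vis_judged = spearman_rho(RANKS["visibility"], RANKS["judged"])
rho_perc_judged = spearman_rho(RANKS["perceptibility"], RANKS["judged"])
rho_speed_judged = spearman_rho(RANKS["speed"], RANKS["judged"])

for name, value in [
    ("visibility vs. perceptibility", rho_vis_perc),
    ("visibility vs. judged legibility", rho_vis_judged),
    ("perceptibility vs. judged legibility", rho_perc_judged),
    ("speed of reading vs. judged legibility", rho_speed_judged),
]:
    print(f"{name}: {value:+.2f}")
```

Under these assumptions the four coefficients round to +.58, +.58, +.67 and +.33, the values quoted in the text.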

The results in Table III demonstrate, therefore, that various 
measures are far from agreement with each other. Is there any 
suggestion as to which ones are the more valid as measures of 
readability? In the first place both visibility and perceptibility 
at a distance are measures that do not represent normal reading 
situations; in fact they are highly artificial. It would seem that, 
in the final analysis, readability should be measured during 
normal (ordinary) reading if the measure is to be valid. Further- 
more, visibility and perceptibility scores as criteria of readability 
can lead to absurd conclusions. Thus large type sizes (fourteen, 
eighteen, twenty-four point, etc.) yield high visibility and 
perceptibility scores but they produce textual material which 
prevents the effective use of word-forms (configurations) as clues 
in perception, which is highly important for smooth and fast 
reading. The larger type sizes reduce the reader to perceiving 
words in sections rather than in wholes which is an undesirable 
practice. Similarly, the fact that the larger type sizes produce 
fewer words per unit horizontal distance prevents maximal use 
of peripheral vision in reading. Another example is reading 
lower-case versus material in all-capitals. Tinker’? has shown 
that perceptibility scores for all-capitals are markedly greater 
than for lower-case. But again effective perception is prevented 
because of the absence of word-form clues. So material in all- 
capitals is read slower than lower-case printing and is disliked. 
The same argument holds for certain variations in type faces, 
i.e., with American Typewriter and Cloister Black, where 
familiar word-forms are partially eliminated or are altered. 

There are, however, certain situations where visibility and 
perceptibility are obviously factors in readability: (1) For 
relatively small type sizes, as six and seven point, poor visibility and 
perceptibility reduce readability. (2) Variation in brightness 
contrast between print and background affects readability and 
this is due to variations in perceptibility. For example, Preston, 
Schwankl, and Tinker’ found a correlation of about +.86 
between perceptibility and speed of reading for variations in 
color of print and background. Material with least brightness 
contrast between print and paper was read slowest and had 
least perceptibility score. 

Of the techniques employed in this study,¹ speed of reading 
appears to provide the best possibilities as a measure of reada- 
bility. It provides measurement in a normal, ordinary reading 

¹ Evaluation of the “blink technique” as a measure of readability will 
be made in a later paper. 

situation. Satisfactory controls, such as use of equivalent forms 
and checks on comprehension, are provided. Contrary to the 
contentions of Luckiesh and Moss,! an adequately controlled 
speed of reading technique is a fairly sensitive indicator of 
readability. For instance, it detects just as consistently as 
visibility and perceptibility measures, the decrement in reada- 
bility for small type sizes and for poor contrast between print 
and paper. Distinct advantages of speed of reading as a measure 
of readability include the following: 

1) For variations in type size, it shows not only the retarding 
effect due to poor visibility of small type, but also detects the 
levels at which larger type sizes disrupt smooth reading due 
(a) to inadequate word-form clues, and (b) to inability to make 
maximum use of peripheral vision. Thus, an optimal size of 
type, centered around ten to eleven point, is found to be more 
readable than either small or large type sizes. Luckiesh and 
Moss! (p. 137) dodge the issue in their attempt to reconcile high 
visibility of large type sizes with poor readability. 

2) For variations in contrast between print and back- 
ground, speed of reading readily detects conditions deleterious 
to readability. 

3) For variations in type face, speed of reading as a measure of 
readability detects those variations which reduce word-form clues 
and thus disrupt smooth reading even though visibility and 
perceptibility of the printed characters are high. Furthermore, 
a speed of reading technique detects those type faces which are 
equally readable although there may be differences in visibility 
and perceptibility due to variation in boldness. In other words, 
above a certain threshold accentuation of boldness does not alter 
readability although it does change visibility and perceptibility 
at a distance. 

4) It is obvious that visibility and perceptibility at a distance 
are of no use in measuring readability for typographical variations 
such as leading, line width, size of margins and inter-columnar 
spacing, while speed of reading is of use. 

In view of the considerations listed above, it would seem that 
speed of reading, when used with adequate controls, is a more 
valid measure of readability of print than either visibility or 
perception at a distance. Subjective judgments of readability 
can only be considered as an expression of preference which may 
be employed to advantage in a practical way for the guidance of 
printers when there is a choice to be made between equally 
readable typographical arrangements‘ (p. 20). 


SUMMARY 


1) The purpose of this study is to compare visibility, per- 
ceptibility at a distance and speed of reading as measures of 
readability for materials printed in ten type faces. 

2) Visibility was measured with the Luckiesh-Moss Visibility 
Meter, perceptibility at a distance by means of an optical bench, 
and speed of reading by means of standardized reading tests. 
Reader opinions of legibility are also cited. 

3) Thirty-six university students served as subjects for visi- 
bility measurements. Data from published reports are cited 
for the other measures. 

4) Although there were marked individual differences present, 
the trends from subject to subject were consistent. 

5) Differences between the type faces are more striking in 
terms of visibility, perceptibility and reader preferences than for 
speed of reading. 

6) Measures of visibility and perceptibility corresponded 
moderately well except for Cloister Black type face. Neither of 
these measures agrees with speed of reading. Reader prefer- 
ences agree best with perceptibility at a distance. 

7) Analysis of the results, taking account of the normal reading 
situation, perceptual habits in reading and practicality, indicates 
that speed of reading when adequately controlled is the most 
valid of these as a measure of readability. 

8) Limitations of the measures are indicated. 
A MULTIPLE-FACTOR ANALYSIS 
OF THE CHARACTER TRAIT 
INTERCORRELATIONS PUBLISHED BY 
SISTER MARY McDONOUGH 


HUBERT E. BROGDEN 


Personnel Research Subsection, The Adjutant General’s Office 


INTRODUCTION 


It is usually considered desirable that scientific investigations 
be designed in such a manner that definite hypotheses might be 
tested. In factorial analyses, this implies development of 
plausible theories concerning the nature of the resultant factors, 
and the consequent selection of the measurements to be factored, 
so that vanishing loadings of three or more variables determine 
the locations of each primary trait vector in relation to each of the 
remaining reference axes. However, in the field of personality 
the empirical evidence necessary for the formulation of such 
hypotheses is exceedingly scarce. Moreover, the labor involved 
in the collection of data for factor analysis is so very considerable 
that it would seem desirable to avoid purely exploratory studies. 

Valuable preliminary evidence might well be obtained by 
factoring data that are at present available in the literature. 
Many such data have been gathered with extreme care and at the 
expense of considerable labor. More elaborate analyses of these 
data should provide, with a minimum of labor, evidence which 
would enable the formulation of hypotheses and the more 
fruitful planning of later and more extensive factor studies. 

The present study is intended to serve this purpose. Since 
the procedure of Sister McDonough’s The Empirical Investigation 
of Character seemed to promise factor results of greater than 
ordinary significance, her published character trait intercorre- 
lations were chosen for this factor study. 

It is essential, first of all, to review certain pertinent details 
regarding Sister McDonough’s investigation. Fifty pupils of a 
Catholic parochial school comprised the population of this rating 
study. The mean chronological age of these fifty pupils was 
156 months, the standard deviation being fifteen months. There 
were twenty boys and thirty girls. Ratings were made on each of 
thirty-four characteristics by three teachers, all of whom had 
frequent contact with the children. If the ratings of one judge 
were found to correlate less than .6 with those of the other two 
judges, ratings for that judge on that item were discarded, and 
if the intercorrelations of all judges fell below .6, that item was 
discarded. Consequently, the average intercorrelation of the 
ratings of different judges was above .6 for every trait, the mean 
of these average intercorrelations being .72. Reliabilities of the 
ratings on any particular characteristic, as predicted from the 
intercorrelations of subtraits by the Spearman-Brown formula, 
were all above .96. A correlation of —.38 between Woodworth- 
Cady Questionnaire scores and ratings on stability, one of .36 
between social ability ratings and a questionnaire measure of 
interest in games, play and amusements, and one of .67 between 
ratings of intelligence and intelligence test scores, provide some 
evidence regarding the validity of these ratings. 
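
The Spearman-Brown formula mentioned above predicts the reliability of a composite from the mean intercorrelation of its parts. The sketch below is illustrative only: the function name `spearman_brown` is hypothetical, and the example call applies the formula to the reported mean inter-judge correlation of .72 with three judges, which is an assumed use and not the computation on subtrait intercorrelations that produced the reliabilities above .96.

```python
def spearman_brown(mean_intercorrelation, k):
    """Predicted reliability of a composite of k comparable parts whose
    mean intercorrelation is given: k*r / (1 + (k - 1)*r)."""
    if k < 1:
        raise ValueError("k must be at least 1")
    r = mean_intercorrelation
    return k * r / (1 + (k - 1) * r)


# Illustrative call (an assumption, not a figure from the source):
# pooling three judges whose ratings intercorrelate .72 on the average.
pooled_judges = spearman_brown(0.72, 3)
print(round(pooled_judges, 3))
```

The formula shows why pooled ratings can reach high reliabilities even when single components agree only moderately: the predicted value rises toward 1.0 as the number of components increases.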

In an attempt to increase the validity and reliability of her 
data, Sister McDonough employed a rather lengthy procedure. 
Lists of subitems assumed to be characteristic of each trait were 
assembled and a week’s observation devoted to the ratings of the 
subitems of that one trait. These subitems were designed in 
order that the obtained indexes might resemble, as nearly as 
possible, behavioral observations. 

It was important to determine whether subitems of any trait 
clearly possessed more than one common factor. Hence, the 
subitems for each trait were intercorrelated and the presence 
or absence of hierarchies decided by examining the resultant table. 
Inter-columnar correlations were sometimes computed. Because 
of the evidence so obtained, scores on many behavior items were 
discarded. Consequently, if a trait score was finally included, 
its subitem intercorrelations approximated a hierarchy. 

Although the ratings of Sister McDonough’s investigation were 
determined with unusual care, it is doubtful whether the influence 
of such distorting factors as the halo effect was entirely elimi- 
nated. The author does not, however, consider this character- 
istic distortion of rating data a fatal defect. If, in drawing 
conclusions from such data, one always considers their nature, 
errors in generalization may be avoided. When multiple-factor 
methods are applied to rating data, the more analytical nature of 
the obtained results should not only aid in avoiding false generali- 
zation from these results, but should also increase understanding 
of the nature and effect of these distorting influences. 
It should be remembered in this connection that the process of 
rating is an extremely important psychological function and that 
a description of the primary traits in a set of ratings has con- 
siderable interest in its own right apart from any deductions that 
one might be willing to make as to the relations between the 
ratings and the actual behavioral characteristics of the subjects. 
It is realized that a more analytic approach to the problem of the 
characteristics of the rating process as such could be made if both 
ratings and measures of these same characteristics obtained 
through other mediums were included in a single analysis. The 
present analysis was, however, necessarily limited to the data as 
published by Sister McDonough. 


PROCEDURE AND RESULTS 


The procedure of the analysis involved simply the application 
of multiple-factor methods to the intercorrelations reported by 
Sister McDonough. After the extraction of four centroid factors 


TABLE 1.*—THE TRANSFORMATION MATRIX 

           I       II      III      IV
1         219     439     063     074
2        -275     288    -542     652
3        -572     501    -408    -573
4        -741     687     732     491


TABLE 2.*—THE INTERCORRELATIONS OF THE PRIMARY FACTORS 

           1       2       3       4
1                 766     027    -024
2                        -050    -172
3                                -211


* Decimals normally preceding each entry have been omitted. 


the residuals were symmetrically distributed and had a standard 
deviation of .051. This is considerably below .141, the SD of 
an original coefficient of zero. Rotations were then made in 
order to maximize the number of zero loadings and thus to mini- 
mize the number of factors involved in each test and the number 
of tests involved in each factor. Although a fifth centroid factor 
was extracted, no significant loadings were obtained on it, and 
since no improvement in the simple structure resulted from the 
further rotations with this fifth factor it was discarded. The 
transformation matrix is presented as Table 1, the correlations 
TABLE 3.*—THE ROTATED FACTORIAL MATRIX 

                                                I      II     III     IV
 1. Will                                       452    112    137   -118
 2. Attentiveness                              181    382    308    073
 3. Truthfulness                               064    524    312    033
 4. Reliability                                492    076    302   -039
 5. Attitude toward work                       606   -055    140   -065
 6. Self-control                               062    533    007    187
 7. Response to reproof                        329    242   -025   -146
 8. Obedience                                  548    025    028    015
 9. Respectfulness                             471    088   -019   -154
10. Generosity                                 050    401    223   -355
11. Stability                                  143    435   -061    123
12. Religiousness                              038    430    158   -246
13. Refinement                                 096    399    249   -155
14. Lack of contentment                       -181   -239    369    515
15. Independence                               318    101    521    065
16. Self-consciousness                         143    351   -214    434
17. Lack of cheerfulness                      -098   -148   -215   6109
18. Neatness                                   441   -091   -153    139
19. Lack of tendency to be sympathetic        -068   -110    148    686
20. Intelligence                              -021    218    842   -028
21. Orderliness                                207    076    031    323
22. Lack of tendency to be affectionate       -049    160   -075    776
23. Tendency to be active                     -058   -196    646   -269
24. Lack of humor                              147    088   -286    421
25. Lack of sociability                        058    285   -177    682
26. Lack of credulity                          074    109    802    267
27. Lack of expressiveness                    -002    482   -326    384
28. Lack of tendency to look for sympathy     -155    700    192    157
29. Lack of conceit                           -147    637   -135    115
30. Lack of quarrelsomeness                   -101    645   -016   -047
31. Lack of irritability                      -060    587   -117   -148
32. Lack of impulsiveness                      075    497   -041    243
33. Lack of emotionality                       187    417   -137    199
34. Lack of forwardness                       -084    690    069    154

* Decimals normally preceding each entry have been omitted. 

between the primary traits as Table 2, and the rotated factorial 
matrix as Table 3. Names of each of the thirty-four traits are 
listed with the rotated factor loadings. 
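
The comparison of the residuals with the figure .141 can be illustrated numerically. The sketch below assumes that .141 was obtained as 1/√N with N = 50 pupils; the text gives the value but not the formula, so this derivation is an inference, and the function name is hypothetical.

```python
import math


def sd_of_zero_correlation(n_subjects):
    """Approximate sampling SD of a correlation coefficient whose true
    value is zero, taken here (an assumption) as 1 / sqrt(N)."""
    return 1 / math.sqrt(n_subjects)


N_PUPILS = 50          # size of the rated population reported in the study
RESIDUAL_SD = 0.051    # SD of residuals after four centroid factors

sd_zero = sd_of_zero_correlation(N_PUPILS)
print(round(sd_zero, 3), round(RESIDUAL_SD / sd_zero, 2))
```

On this reading the residuals after four factors scatter far less than chance correlations among fifty cases would, which is the ground for stopping the extraction at that point.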

A list of factor loadings which exceed .4 will precede the dis- 
cussion of each factor. This list is the first and most unequivocal 
definition of the factor. In order, however, to develop a more 
abstract statement of the element common to the listed traits, 
hypotheses concerning this element will be formed. These 
should indicate the nature of further traits expected to be highly 
loaded with the given factor. Ideally, the hypothesis should be 
worded so as to predict all traits which will be loaded on the 
factor and, conversely, all those not involving the common 
element. 
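
The selection rule just described (list every variable whose loading on a factor exceeds .4) can be sketched in a few lines of Python. This is only an illustration, not the procedure used in the study: the loadings are the Factor I column of Table 3 with decimal points restored, and the name `salient_variables` and the default threshold argument are assumptions introduced here.

```python
# Factor I loadings for variables 1-34, transcribed from Table 3
# (decimal points restored). Names below are hypothetical.
FACTOR_I = [
    .452, .181, .064, .492, .606, .062, .329, .548, .471, .050,
    .143, .038, .096, -.181, .318, .143, -.098, .441, -.068, -.021,
    .207, -.049, -.058, .147, .058, .074, -.002, -.155, -.147, -.101,
    -.060, .075, .187, -.084,
]


def salient_variables(loadings, threshold=0.4):
    """Return (variable number, loading) pairs above the threshold, largest first."""
    pairs = [(i + 1, x) for i, x in enumerate(loadings) if x > threshold]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for number, loading in salient_variables(FACTOR_I):
        print(f"{number:2d}  {loading:.3f}")
```

Applied to the Factor I column, the rule picks out variables 5, 8, 4, 9, 1, and 18, in that order of loading.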

It should be understood that any names applied to the primary 
traits are symbols whose full meaning is to be found in the listed 
traits or stated in the various hypotheses. Excepting con- 
venience, naming adds little of value to the definition provided 
by the list of variables high on the given factor. Names some- 
times applied to factors are apt to be misleading since, because of 
conventional meaning, the terms carry implications not justified 
by the data, and not likely to be verified by further investigation. 


TABLE 4.—Factor I 


Variable Loading Trait 
5 .606 Attitude toward work 
8 .548 Obedience 
4 .492 Reliability 
9 .471 Respectfulness 
1 .451 Will 
18 .441 Neatness 


Before an interpretation of Factor I is attempted, it will be 
helpful to explain more fully the nature of a number of the 
variables of the above list, since the name assigned to some of 
them does not fully and unambiguously describe the subitems 
which determined the scores on that variable. Thus ‘will’ consists 
of such subitems as persistence in school work, attentiveness in 
spite of distractions and the keeping of resolutions; ‘attitude 
toward work’ is concerned with subitems such as diligence in 
school work; ‘respectfulness’ involves subitems having to do with 
politeness and courtesy; while ‘reliability’ has to do with the 
preparation of home work, the keeping of promises, honest scoring 
of papers, and diligence during study periods wherein supervision 
is absent. Both ‘obedience’ and ‘neatness’ need little added 
explanation although it should be noted that both are concerned 
in part with specific happenings in the classroom situation. 

The variables listed in Table 4 as thus elaborated are seemingly 
characteristic of a child who is ‘on his good behavior.’ The 
primary trait here involved appears to be the tendency of the 
child to react positively to the teacher and the school situation 
and hence to exhibit those traits thought by him to be socially 
desirable in that situation. Consequently Factor I will be 
referred to as a measure of goodness of classroom behavior. 

It should be emphasized here that all of these characteristics 
high on Factor I heavily involve interaction between teacher and 
pupil and in addition are direct objectives of classroom discipline. 
The child soon learns, and is directly and continually reminded, 
that he should be diligent in his classroom work, and that he 
should be obedient, reliable, respectful, and neat. 

Examination of Table 3 reveals three variables; namely, 
‘attentiveness,’ ‘response to reproof,’ and ‘orderliness,’ whose 
absence from the above list is possibly inconsistent with the 
hypothesis just stated. However, it will be noted that all of 
these variables have loadings sufficiently close to .4 that the 
deficiency may well be due to sampling error. 

In the foregoing statement as to the nature of Factor I there is 
the definite implication that the behavioral characteristics in 
Table 4 are situational in nature and not general traits. This 
does not mean, however, that the behaviors involved may not 
show consistencies from one situation to another. Prediction of 
the behavior in a particular social situation may depend upon 
knowledge of two variables; namely, positive-negative reaction 
to the given situation and the patterns of behavior characteristic 
of the positive and the negative response. Of course, it is still 
possible that the tendency to exhibit either the positive or nega- 
tive pattern may itself be a general trait. It should be pointed 
out that these foregoing points are in no sense proven by the data 
here presented. The further possibility of lack of consistency in 
both traits and patterns is probably as plausible as those 
suggested. 


TABLE 5.—Factor II 


Variable Loading Trait 
28 .700 Lack of tendency to look for sympathy 
34 .690 Lack of forwardness 
30 .645 Lack of quarrelsomeness 
29 .637 Lack of conceit 
31 .587 Lack of irritability 
6 .533 Self-control 
3 .524 Truthfulness 
32 .497 Lack of impulsiveness 
27 .482 Lack of expressiveness 
11 .435 Stability 
12 .430 Religiousness 
33 .417 Lack of emotionality 
10 .401 Generosity 


Several of the variables listed in Table 5 are in need of some 
elaboration. ‘Forwardness’ is the sum of ratings on lack of 
timidity, lack of modesty, lack of tendency to be retiring, and 
continual desire to recite or run errands. ‘Expressiveness’ is 
the sum of subitems such as the following: tendency to be easily 
moved to anger or grief, to recover composure quickly, to get 
excited over trifles, to become confused easily, not to be stiff or 
sedate, to give vent to feelings without restraint, to be spon- 
taneous, not to be quiet or calm, and not to be certain of one’s 
self. ‘Generosity’ involves such subitems as tendency to think 
of others, not to expect others to wait on him, to offer help, forgive 
those who hurt him, and to be unselfish. 

When these elaborations are kept in mind the variables high 
on Factor II all seem to involve the tendency to inhibit behavior 
which might be described as direct or natural and at the same 
time as egocentric and socially undesirable. Hence, the name 
sensitivity to social disapproval will be assigned to Factor II. 

Examination of Table 3 reveals no variable whose negative or 
near zero loading on Factor II is inconsistent with the hypothesis 
just stated, with the possible exception of those variables loaded 
highly on Factor I. It might be noted in this connection that 
‘attentiveness,’ ‘self-consciousness,’ and ‘refinement’ have load- 
ings on Factor II between .3 and .4. 

The loading of ‘self-consciousness’ on Factor II together with 
the presence of certain subitems in both ‘forwardness’ and 
‘expressiveness’ and the general nature of the variables listed in 
Table 5, suggest the possible identity of Factor II and the factor 
which has been isolated by both Guilford and Mosier and called 
‘shyness’ or S by the former. The hypothesis of sensitivity to 
social disapproval advanced in explaining Factor II will explain 
S very nicely. While Guilford in the analysis just mentioned 
isolated a factor which he regarded as emotional in nature, and 
which might consequently be thought to resemble certain of the 
variables in Table 5, this resemblance is in the author’s opinion 
superficial. Furthermore, in his more recent analysis items such 
as ‘not impulsive’ and ‘does not crave excitement’ were loaded 
on S. Except for the obvious differences between personality 
questionnaire items and ratings, the first of these items is identical 
with variable number 32 in Table 5. Very possibly, although the 
tendency to become emotional is unrelated, the tendency to 
inhibit emotions in social situations is related to S and to Factor II. 

A study by Symonds is of some interest in this connection in 
that he reported a number of characteristics rather similar to 
those highly loaded on Factor II as being most closely related to 
the variable of lax-severe home discipline. A genetic explanation 
of Factor II and possibly of S is thus suggested. Severe home 
discipline produces sensitivity to disapproval through condition- 
ing processes, and generalization to other situations occurs. 
Individuals so conditioned exhibit socially desirable behavior in 
order to avoid disapproval and become embarrassed when their 
behavior is disapproved. Because of their concern over the 
possibility of disapproval and their anticipation of it, they tend 
often to inhibit responses which they have reviewed implicitly or 
hesitate until the time for action has passed and thus remain 
quiet and reserved in social situations. 

It is realized that factor results should be regarded as suggesting 
but by no means supporting theories such as those just discussed. 
Factor results are derived from data gathered under non-experi- 
mental conditions and bear no necessary relation to problems of 
causation. 

It will be noted in Table 2 that the only entry of significance is 
the correlation of .766 between Factors I and II. The magnitude 
of this coefficient may be explained in terms of: (1) the saturation 
of socially desirable classroom attitudes with the general trait of 
sensitivity to social disapproval; (2) the tendency for ratings of 
the characteristics highly loaded on Factor II to be concerned 
with behaviors occurring in the classroom situation; and (3) the 
operation of the halo effect. Since all three of these presumably 
may affect both Factors I and II, it is not possible to isolate the 
influence of any one of them. In connection with the second 
point it must be remembered that while the characteristics high 
on Factor II are observed by the teacher in the playground and 
other situations where the interaction is between children, he very 
probably weights more heavily those incidents in which he 
himself is involved. The fact that the variables loaded on both 
of these factors have to do with social desirability in itself strongly 
supports the statement that the halo effect would operate to 
increase the correlations between these two factors. 

In this connection it will be remembered that the well-known 
study of Wickman seems to indicate that teachers as a class 
tend more than most persons to regard the traits listed in Tables 4 
and 5 as being socially desirable. Wickman found that teachers 
take orderliness in classroom and application to school work 
rather seriously, but think of the withdrawing and recessive 
personality as a minor problem. Since the high correlation 
between Factors I and II demonstrates a relationship between the 
withdrawing personality and a desirable classroom attitude, we 
have a possible explanation of the teacher’s tendency to disregard 
the problem of the withdrawing personality. 


TABLE 6.—Factor III 


Variable Loading Trait 
20 .842 Intelligence 
26 .802 Lack of credulity 
23 .646 Tendency to be active 
15 .521 Independence 


Factor III seems to heavily involve g together with character- 
istics highly associated with general intelligence and/or with 
characteristics which give to the rater the impression of general 
intellectual ability. In order to isolate as a separate factor 
characteristics other than g which influence teachers in rating of 
intelligence it would have been necessary to have included in the 
analysis several objective measures of g as well as ratings of 
intelligence and personality characteristics. It will be remem- 
bered from the introduction that tested intelligence correlated 
.67 with ratings of intelligence. 

While it may seem inappropriate to speak of g in connection 
with multiple-factor results, it must be remembered that at least 
two recent analyses have revealed important general factors. 
It is realized, of course, that the general factors obtained in 
these studies are subject to other interpretations. Since, also, 
other primary mental abilities may well be involved in Factor 
III, the evidence does not permit any very positive identification 
of this factor with g. 

Of interest is the considerable loading of ‘independence.’ 
Unpublished data of the author’s revealed, in a small sample, a 
correlation of —.60 between the F-2 score of the Bernreuter 
Personality Inventory and grades in an elementary psychology 
course. Carr reports a correlation of .38 between the Bernreuter 
self-sufficiency scores and intelligence test scores. Self-suffi- 
ciency, lack of sociability, and independence seem subjectively 
to have much in common. It is hoped that further study will 
throw light upon the exact nature of the relationship between 
these personality traits and mental abilities. At present, little 
can be said beyond the assertion that there is some correlation 
between abilities involved in intelligence tests and certain aspects 
of personality related to independence or self-sufficiency. 


TABLE 7.—Factor IV 


Variable Loading Trait 
22 .776 Lack of tendency to be affectionate 
19 .686 Lack of tendency to be sympathetic 
25 .682 Lack of sociability 
17 .610 Lack of cheerfulness 
14 .515 Lack of contentment 
16 .434  Self-consciousness 
24 .421 Lack of humor 


Garnett’s C or ‘cleverness’ factor which was isolated from 
Webb’s data with Spearman’s single-factor technique bears 
considerable resemblance to Factor IV. The two variables 
having highest loading in C are ratings on ‘general tendency to 
be cheerful’ and on ‘degree of sense of humor.’ If, on this 
basis, we tentatively assume identity of these two factors, it 
might be of interest to list further characteristics correlating 
highly with C and thus presumably correlating highly with 
Factor IV. These variables are: (3) ‘quickness of apprehension,’ 
(4) ‘originality of ideas,’ (5) ‘intensity of his influence on his 
special associates,’ and (6) ‘fondness for large social gatherings.’ 
By way of warning, however, two differences between Garnett’s 
and the present study should be noted; namely, the difference in 
mean age of the subjects (those involved in Garnett’s analysis 
being several years older) and the difference between Spearman’s 
single-factor technique employed by Garnett and the 
multiple-factor technique employed in the present study. 

If the previous suggestion concerning the identity of Factor II 
and S is correct, it would appear that Factor IV, being unrelated 
to Factor II, is also unrelated to S. In this connection it should 
be mentioned that a primary trait bearing some resemblance to 
Factor IV has been isolated by Layman and by Brogden and 
Thomas and named ‘gregariousness’ by the latter authors. 
This factor, which seems primarily concerned with liking for 
companionship and social gatherings, was in Layman’s analysis 
independent of items representative of S. 


DISCUSSION 


The finding that several of these factors seem to be situational 
in nature might seem discouraging to those interested in factor 
methods in that it suggests limited generality and unlimited 
numbers of such primary personality traits. If this were true, 
but little economy in description could be expected to result from 
use of such methods. However, while the reaction to situations 
of little interest could undoubtedly be described in terms of 
primary factors and the number of such primary factors would 
possibly be very large, it is still true that many situations recur 
so often that knowledge of an individual’s adjustment to them is 
of importance. Certainly the description of an individual’s 
adjustment to the school situation in terms of one, or for that 
matter several unidimensional traits, would be a significant 
contribution. Furthermore, situational responses may tend to 
generalize; as, for example, those involving authority or, to take 
an even more extreme example, social responses. 

In the author’s opinion, the factor analysis of any response 
tendencies due in large part to learning should, logically, tend to 
result in situational factors. If personality traits are learned, 
any theory of the manner of their acquisition must assume 
functional relationships between situations and response tend- 
encies. When response tendencies are learned and are hence 
responses to situations, it would seem to follow that similar 
responses would be made to similar situations because of gener- 
alization and transfer and, hence, that there would be correlation 
between these response tendencies occurring in similar situations, 
and that after factoring situational traits would be revealed. 
This, of course, involves the assumption, which is reasonable, 
that there would be individual differences in the response tend- 
encies thus acquired. 

One possible reason for the tendency to disregard situational 
factors in the description of personality lies in the fact that the 
trait names have in popular writing and even in psychology no 
reference to the situation. Thus in most rating studies traits are 
rated in general with no attempt to define situations in which 
they appear. Since situations were not defined and measure- 
ments were not obtained separately for different situations it 
was natural that no situational factors were found or suspected 
in the data thus gathered. 

Even though situational responses and factors resulting there- 
from are judged in many cases to be of little direct interest, they 
may possibly be of considerable importance in obtaining accurate 
measurement of general traits when the variance of the measure- 
ments loaded on the general traits is in good part accounted for 
by these situational factors. Situational factors would in such 
instances aid by suppressor action; that is, they would make 
possible measurement of variance which would otherwise have 
to be regarded as error variance. When such variance can be 
accurately and independently measured, its effect on the measure- 
ment of the primary trait can, of course, be eliminated through 
the negative beta weights which will be obtained for these sup- 
pressor variables in multiple regression analysis. 
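
The suppressor action described above can be made concrete with the usual two-predictor formulas for standardized regression weights. The sketch below is illustrative only: the three correlations are hypothetical values chosen for the example, not figures from the present data, and the function names are assumptions. Predictor 1 stands for a rating loaded on the general trait, and predictor 2 for a measure of the situational factor that is uncorrelated with the trait but shares variance with the rating.

```python
def standardized_betas(r_y1, r_y2, r_12):
    """Standardized regression weights for two predictors, from correlations.

    r_y1, r_y2: correlations of each predictor with the criterion.
    r_12: correlation between the two predictors.
    """
    denom = 1.0 - r_12 ** 2
    beta_1 = (r_y1 - r_y2 * r_12) / denom
    beta_2 = (r_y2 - r_y1 * r_12) / denom
    return beta_1, beta_2


def multiple_r_squared(r_y1, r_y2, r_12):
    """Squared multiple correlation for the two-predictor case."""
    beta_1, beta_2 = standardized_betas(r_y1, r_y2, r_12)
    return beta_1 * r_y1 + beta_2 * r_y2


if __name__ == "__main__":
    # Hypothetical correlations: rating with trait .50, situational
    # measure with trait .00, rating with situational measure .60.
    b1, b2 = standardized_betas(0.50, 0.0, 0.60)
    print(f"beta for rating:              {b1:.3f}")
    print(f"beta for situational measure: {b2:.3f}")
    print(f"R squared with suppressor:    {multiple_r_squared(0.50, 0.0, 0.60):.3f}")
    print(f"r squared, rating alone:      {0.50 ** 2:.3f}")
```

Under these assumed values the situational measure receives a negative weight (about -.47) although it does not correlate with the trait, and the squared multiple correlation rises from .25 to about .39.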

This suggestion appears to have general implications for factor 
work. It is not improbable that situational factors are often 
present in the data of factor studies in other fields but are not 
isolated because all of the variables defining such factors have 
still further factors in common. Factors of this sort might result 
if a method of administration is common to several tests. They 
could presumably be isolated if the study is properly designed, 
and contributions to predictive efficiency might then result for 
the reason mentioned in the above paragraph. It would appear 
desirable, then, to design a factor study so that variables thought 
to measure a common characteristic measure it under conditions 
and through methods as divergent from each other as possible. 
For the same reasons, if it is thought that ratings or character- 
istics measured in a single situation, or abilities measured by a 
single type of test (such as speed, group or individual tests) have 
common characteristics, an attempt should be made to isolate 
this common characteristic as a separate factor by including a 
number of variables measuring different abilities or traits through 
the same mediums even though the assumed common factors 
have little direct interest. Such an experimental design should 
in addition materially aid in the isolation of simple structures 
and thus avoid difficulties of interpretation such as those encoun- 
tered in the present study. 

It should possibly be noted that the idea here expressed is 
directly opposed in its practical implications to the tendency on 
the part of most investigators to look for variables which are 
factorially pure. It appears to the author that recognition of 
necessary impurity of variables and redirection of the search 
toward the finding of narrow group factors in what was formerly 
regarded as specificity by procedures such as those just discussed 
has considerable promise. 


THE RELATIONSHIP BETWEEN 
VOCABULARY LEVELS AND LEVELS OF 
GENERAL INTELLIGENCE IN PSYCHOTIC 
AND NON-PSYCHOTIC INDIVIDUALS 
OF A WIDE AGE-RANGE 


ALBERT I. RABIN 
New Hampshire State Hospital 


INTRODUCTORY 


The usefulness of a well-standardized vocabulary test as an 
indicator of the level of general intelligence has been pointed 
out on numerous occasions. Terman says in the first revision 
of the Stanford-Binet test that “the vocabulary test has a far 
higher value than any other single test of the scale—it probably 
has a higher value than any three other tests in the scale (p. 230).” 
More than twenty years later Terman and Merrill state as 
follows: 


We have found the vocabulary test to be the most valuable single 
test of the scale—(it) gives the examiner a rapid survey method of 
estimating the subject’s ability. It agrees, to a high degree, with the 
mental age rating of the scale as a whole; correlations for single age 
groups range from .65 to .91 with an average of .81 (p. 302). 


The statements quoted above are based on results obtained 
from children. Similar findings, however, were given by Roe 
and Shakow, who surveyed the relationship in normal as well as 
abnormal adults. The coefficients of correlation between 
vocabulary and MA (old Binet) ranged from .76 in psychoneu- 
rosis to .92 in paranoid patients with the normal group showing 
an r = .82. 

Another interesting aspect concerning the vocabulary level is 
its alleged relative stability. While other intellective factors 
may be affected by increasing age or the onset of a psychosis, 
the vocabulary remains comparatively untouched. The well- 
known studies of Babcock and of others using her tests prove to 
the satisfaction of a great many investigators the efficacy of the 
vocabulary level as an indicator of the ‘potential level’ of 
intelligence in schizophrenia, paresis, and other mental diseases. 
The discrepancy between the potential level, based on vocabulary, 
and the actual mental level, based on a battery of miscellaneous 
tests of the type requiring new learning, making new associations, 
etc., yields an ‘efficiency index,’ according to Babcock. The 
discrepancy is close to zero in the normal individual and increases 
with the severity of the condition in the psychotic. The benign 
psychosis and psychoneurosis show comparatively little ineffi- 
ciency, while the more malignant mental conditions show it to a 
high degree. 

Increasing age, even under normal conditions, causes a reduc- 
tion in the MA or, operationally speaking, in intelligence test 
achievement. This reduction reflects the ‘normal process of gradual 
deterioration’ that accompanies advancing age. The most up-to-date 
intelligence scale for adults, the Wechsler-Bellevue, allows for 
such a reduction in its system of IQ computation. However, a 
study of the vocabulary (old Binet) in a representative sample of 
two hundred normal adults grouped in several age groups shows 
that the vocabulary level remains practically unchanged through- 
out the average life-span. Shakow and Goldman conclude as 
follows: 


When thus equated indirectly for mental level, vocabulary score was 
found to remain constant at a level of about fifty-seven words to the 
seventh decade with a slow decline thereafter (p. 254). 


Most of the studies dealing with the relationship between 
vocabulary level and general intelligence have employed the 
Binet as their yardstick. The comparison is justifiable at the 
chronological ages for which the Binet was devised. It does not 
hold very well for adults whose total MA may naturally drop 
with age, while the vocabulary may remain constant. The new 
Binet vocabulary has not been used in any of the studies. 

Moreover, the work with psychotics indicating the dis- 
crepancies between vocabulary levels and general intelligence 
levels does not take, in the case of intelligence levels, the age 
factor into consideration. Thus, higher discrepancies may be 
expected in the later decades of life. 


PURPOSE 


Considering the points stressed in the foregoing paragraphs, it 
appeared that a comparison between the new Stanford-Binet, 
Form L vocabulary, and a well-standardized adult intelligence 
scale such as the Wechsler-Bellevue, in psychotic and non- 
psychotic adults of a wide age range, might yield some interesting 
as well as useful data. An answer to the following questions 
will, therefore, be attempted: 

1) How well may the Form L vocabulary serve as an indicator 
of general intelligence in adult subjects? 

2) What relationship is maintained between the vocabulary 
level and general intelligence through the six decades from 
adolescence to senility? 

3) Does the degree of the vocabulary’s efficacy as a measure 
of general intelligence differ in the several diagnostic groups? 

4) How do the discrepancies between the vocabulary level and 
level of general intelligence differ in several psychopathological 
diagnostic groups? 


TESTS AND SUBJECTS 


The Form L vocabulary of the 1937 Stanford-Binet Revision 
and the Full Scale of the Wechsler-Bellevue test were individually 
administered to two hundred sixty-eight persons admitted to 
the New Hampshire State Hospital. Ninety patients of this 
group were diagnosed as schizophrenic; eighteen as manic 
depressive; twenty-four as psychoneurotic; nine luetics; and 
fifty-two cases were designated as ‘without psychosis.’ The 
remainder of the two hundred sixty-eight individuals were placed 
in several other diagnostic classifications; there were too few in 
each diagnostic group to be treated separately. 

A wide age range was represented in this sample of the patient 
population. The youngest patient was fifteen and the oldest, 
sixty-nine. There were twenty-five cases under twenty years of 
age (second decade of life), fifty-four in the third decade, seventy- 
six in the fourth decade, and fifty-two, forty-four and seventeen 
in the fifth, sixth and seventh decades, respectively. 

There was no special selective factor in operation. All of 
these hospitalized individuals were examined by means of the two 
tests shortly after their admission, regardless of diagnosis. 
Satisfactory examinations only were included. The results of 
individuals with a marked language handicap due to a bilingual 
background were not included in this study. 


PROCEDURE AND RESULTS 


The first step in the treatment of the data was to correlate the 
Form L vocabulary with the full scale IQ of the Wechsler- 

414 The Journal of Educational Psychology 

Bellevue. The total number of words passed on the vocabulary, 
rather than the corresponding MA, was used as the indicator 
of the mental level. The coefficient of correlation for the 
entire heterogeneous group of two hundred sixty-eight patients is 
.78. It assumes a median position among the r’s resulting from 
the correlation of the vocabulary with the total Binet reported 
by Terman and Merrill and others. Moreover, the relation- 
ships reported are present in children; it is reasonable to assume 
that the correlation would not be so high among adults. Mitchell 
checked this point with adult university students and found a 
correlation of only .63 between the vocabulary of the 1937 
revision and the full Binet Scale. The obtained correlation, 
(.78), therefore is quite high—among the highest reported in 
adult subjects. The mean vocabulary score (number of words) 
for the entire group of two hundred sixty-eight is 20.1. This is, 
according to the Revised Stanford-Binet, the average adult level 
(twenty words). However, the corresponding mean IQ on the 
Wechsler-Bellevue test is 89.7, which can, at best, be described 
as low average. Ideally, the corresponding mean IQ should also 
be the average, 100. The corresponding verbal and per- 
formance IQ’s are 92.1 and 93.3, respectively. It appears, 
therefore, that the mental level obtained on the basis of the 
vocabulary is somewhat overestimated. Roughly speaking, 
the difference is about ten IQ points. It is certainly not due to a 
verbally superior population, since the mean performance IQ 
is even slightly higher than the mean verbal IQ on the Wechsler- 
Bellevue scales. Spache, even in a selected group of children 
in a private school, found a similar overestimation, when the 
vocabulary MA was compared with the total Revised Binet MA. 
He summarizes his findings as follows: 


In our population, underestimation or overestimation of scale MA 
when using the vocabulary MA was as great as three years, in some 
cases. Underestimation of scale MA by approximately one year 
occurred in fourteen per cent of the cases. Overestimation of similar 
amounts occurred in forty-nine per cent (513). 


Thus, it appears that, even with children, the predominant 
tendency is that of overestimation of the MA by means of the 
vocabulary. 


STANFORD-BINET VOCABULARY AND GENERAL INTELLIGENCE 


In Table 1 the correlational results at the various age levels 
(from second through seventh decade) may be found. As was 
expected, the verbal scores correlated highest with the vocabulary, 
while the performance scores of the Wechsler-Bellevue show the 
lowest correlations. The ‘total scale,’ which is a combination 
of both the performance and verbal scales, offers correlations 
which occupy the middle position between the high r’s of the 
verbal and low r’s of the performance scale. There appears no 
definite trend in the ‘full scale’ or performance scale correlations; 
but the verbal scale coefficients of correlation show a slight but 
consistent decline with increasing age. The total scale IQ’s were 
also correlated with the verbal level. No definite trend can be 
noted in this group of coefficients of correlations. They are all 
fairly high, favoring significantly no special age group. 


TABLE 1.—VOCABULARY SCORES CORRELATED WITH THE 
WECHSLER-BELLEVUE 


                Below 20   20-29   30-39   40-49   50-59   60-69
Total Score        .69      .85     .81     .78     .84     .70
Verbal Score       .88      .87     .80     .77     .71     .72
Perf. Score        .48      .76     .64     .56     .56     .42
Total IQ           [?]      .86     .82     .76     .85     .34
N =                 25       54      76      52      44      17
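
The reading of the verbal-scale row can be checked mechanically. The following sketch is offered only as an illustration: the coefficients are the Verbal Score row of Table 1 with decimal points restored, and the name `rises` is an assumption introduced here. It reports every pair of adjacent age groups in which the coefficient goes up rather than down.

```python
AGE_GROUPS = ["Below 20", "20-29", "30-39", "40-49", "50-59", "60-69"]

# Verbal Score row of Table 1 (vocabulary correlated with W-B verbal score).
VERBAL_R = [.88, .87, .80, .77, .71, .72]


def rises(values):
    """Return index pairs (i, i + 1) where the series goes up instead of down."""
    return [(i, i + 1) for i in range(len(values) - 1) if values[i + 1] > values[i]]


if __name__ == "__main__":
    for i, j in rises(VERBAL_R):
        print(f"rise from {AGE_GROUPS[i]} to {AGE_GROUPS[j]}: "
              f"{VERBAL_R[i]:.2f} -> {VERBAL_R[j]:.2f}")
    print(f"net change, first to last group: {VERBAL_R[-1] - VERBAL_R[0]:+.2f}")
```

On these figures the decline is unbroken through the sixth decade; the only reversal is a rise of .01 between the last two groups, against a net drop of .16 over the whole range.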

However, the correlations do not tell the entire story. In 
order to have a more detailed analysis of the results, Table 2, 
giving means of vocabulary levels and IQ’s, is offered. Several 
interesting trends become readily apparent upon an examination 
of these results. In the first place, there is an increase of the 
vocabulary level with age. It is as low as 17.4 words in the 
first age group. It fluctuates around the twenty-word mark in 
the next four groups and then jumps to the twenty-four-word 
level in the old age group. The IQ (Wechsler-Bellevue) trend is 
roughly similar in its rise with the age group. 


TABLE 2.—VOCABULARY LEVELS COMPARED WITH W-B IQ AVERAGES 


Age          Average      Total                  Perfor-
Group        SBL Voc.     W-B IQ    Verbal IQ    mance IQ
Below 20       17.4        87.7       86.1         91.7
20-29          19.7        86.3       89.5         83.1
30-39          20.5        89.2       90.5         88.7
40-49          21.3        96.6       95.0         98.8
50-59          20.2        93.1       95.3        106.4
60-69          24.6        98.9      101.2        101.1

A special method* for calculating the equivalent Binet IQ 
from the vocabulary level was employed in order to have com- 
parable IQ figures. The MA figure of 15-4 was taken as the 
equivalent of twenty words (average adult level). The MA’s 
above 15-4, based on vocabulary levels above twenty words, 
were interpolated by means of the corresponding number of 
months on the scale. Then, the corresponding IQ’s were 
obtained from the Terman tables, using 16 as the minimum 
chronological age for adults. This method was devised in order 
to compare the IQ discrepancies between the Wechsler-Bellevue 
Scales and the equivalent IQ’s based on the Binet vocabulary. 


TABLE 3.—MEAN AND MEDIAN DISCREPANCIES BETWEEN 
‘OBTAINED’ (BINET VOCABULARY) AND WECHSLER-BELLEVUE IQ’s

                         Below
                         20      20-29   30-39   40-49   50-59   60-69

M. Disc.   SBL > WB       6.7    20.0    17.9    13.2    19.0    25.2
Mdn Disc.  SBL > WB       4      18      17.5    15      13      30
N          SBL > WB      14      47      61      38      33      15
N          SBL ≤ WB      11       7      15      14      11       2


Table 3 shows, in the first line, the mean positive discrepancies 
between the larger ‘obtained’ (SBL vocabulary) IQ and the 
Wechsler-Bellevue IQ’s. These discrepancies have been individually 
calculated and then averaged. The medians are shown 
in the second line. The next two lines indicate the number of 
cases showing an obtained IQ larger than the Wechsler-Bellevue IQ 
and the number showing an obtained IQ smaller than or equal to it. The 
positive discrepancy in favor of the vocabulary IQ is present at 
all age levels. However, the youngest group shows a comparatively 
small discrepancy. It rises from twice to thrice that 
amount in the next four decades and finally to the highest 
discrepancy in the last decade, nearly four times as much as in the 
‘below twenty’ group. 

* The author is indebted to Margaret H. Sanderson for her assistance on 
this part of the study. 
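
The four rows of Table 3 can be illustrated with a brief sketch in Python. The paired IQ values below are hypothetical and serve only to show the arithmetic; the function name and the data layout are assumptions made for illustration and are not part of the original procedure.

```python
from statistics import mean, median


def summarize_discrepancies(pairs):
    """Summarize (vocabulary IQ, Wechsler-Bellevue IQ) pairs in the manner of Table 3."""
    # Only cases whose vocabulary IQ exceeds the W-B IQ enter the averages.
    positive = [voc - wb for voc, wb in pairs if voc > wb]
    return {
        "mean_disc": mean(positive) if positive else None,
        "median_disc": median(positive) if positive else None,
        "n_voc_greater": len(positive),
        # All remaining cases: vocabulary IQ equal to or below the W-B IQ.
        "n_voc_equal_or_less": len(pairs) - len(positive),
    }


# Hypothetical cases, not data from the study.
cases = [(104, 90), (98, 98), (110, 85), (92, 95), (101, 89)]
summary = summarize_discrepancies(cases)
```

With these invented cases the positive discrepancies are 14, 25, and 12, so the mean is 17 and the median 14, with three cases in the first count and two in the second.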


TABLE 4.—VOCABULARY WECHSLER-BELLEVUE CORRELATIONS 
FOR THE SEVERAL HOSPITAL GROUPS 

                         Total          Verbal         Performance
                      Score   IQ     Score   IQ      Score   IQ       N

Schizophrenia          .74    .75     .79    .83      .52    .54      90
Manic Depressive       .46    .77     .78    .79      .54    .67      18
Luetic                 .746   .98     .72    .98      .42    .94       9
Psychoneurosis         .66    .68     .79    .76      .26    .24      24
Without Psychosis      .88    .86     .85    .88      .74    .75      52


A point of even greater interest in the present investigation is 
the relationship between vocabulary level and general intelligence 
in the several diagnostic groups represented. Table 4 presents 
the summary of the correlational results for five major clinical 
groups. The first striking feature is the comparative stability 
in the correlations of the non-psychotic group. All the coeffi- 
cients are quite high. Even the correlation with the perform- 
ance IQ for this group is as high as .75. Contrary to expectation, 
no outstanding differences between the schizophrenic and manic 
depressive groups are to be observed in this case. The correla- 
tions between the vocabulary and the total scale of the Wechsler- 
Bellevue are slightly higher in the manic depressive group, but 
not appreciably so. The correlations with the two parts of the 
scale for both psychiatric classes are similar indeed. The luetic 
group shows high score correlations and especially high IQ 
correlations. No special significance may be attached to this 
because of the few cases represented by this group. It may have 
been subjected to a special, though unknown, selective factor. 
Finally, the lowest correlations are found in the psychoneurotic 
group. As may be noted from Table 4, the ‘total scale’ corre- 
lations are considerably reduced due to the extremely low 
correlations between the vocabulary and the performance portion 
of the scale. The results of Roe and Shakow‘ previously men- 
tioned are very similar, especially in respect to the low r for the 
psychoneurotic group. 


TABLE 5.—VOCABULARY LEVELS COMPARED WITH WECHSLER-BELLEVUE IQ’s

                      Mean      Mdn
                      Voc.      Voc.      Total     Verbal    Perf.
                      Level     Level     IQ        IQ        IQ

Schizophrenia         21.4      22.0       89.8      91.9      90.2
Manic Depressive      25.4      25.5      103.3     104.0     102.6
Luetic                19.5      20.0       85.1      86.4      86.3
Psychoneurosis        22.8      25.0       95.7      97.0      94.4
Without Psychosis     19.3      20.0       92.1      91.6      93.8


Table 5 presents the vocabulary means and medians as well as 
the IQ means for the several clinical groups. The table shows 
that the manic depressive group has the highest vocabulary level 
while the luetics and non-psychotics have the lowest levels. 
The schizophrenic and psychoneurotic groups occupy a middle 
position. The full scale Wechsler-Bellevue IQ’s follow a similar 
rank order. The only exception is the non-psychotic group, 
showing a higher mean IQ than the schizophrenic group despite 
the fact that the latter’s vocabulary level is higher. 


TABLE 6.—MEAN AND MEDIAN DISCREPANCIES BETWEEN 
‘OBTAINED’ (BINET VOCABULARY) AND WECHSLER-BELLEVUE IQ’s

                        Schizo-    Manic       Psycho-               Without
                        phrenic    Depress.    Neurotic    Luetic    Psychosis

Mn Disc    SBL > WB      20.8       21.9        21.8        18.2      13.4
Mdn Disc   SBL > WB      21.0       27.0        22.5        22.0      14.0
N          SBL > WB      78         14          20           8        37
N          SBL ≤ WB      12          4           4           1        15

Table 6 shows that discrepancies between the ‘obtained’ IQ’s 
based on the vocabulary and the Wechsler-Bellevue IQ’s are very 
large. The smallest discrepancy, as may be expected, is found 
in the non-psychotic group. However, the other discrepancies 
are not at all according to expectation. 

All of the clinical groups show a similar mean discrepancy 
of about twenty points. Apparently the discrepancies are 
insignificant in indicating any intra-pathological-group differ- 
ences, though there is sufficient difference between the non- 
psychotic and other hospital groups. It is interesting to note 
that such disorders as psychoneurosis and manic depressive 
psychosis, ordinarily not entailing any large degree of ‘ineffi- 
ciency,’ show discrepancies as large as, if not larger than, the 
‘deterioration’ psychoses of schizophrenia and general paresis. 
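
As a simple arithmetic check, the case counts printed in Tables 3, 4, and 6 can be added up. The sketch below, in Python, only transcribes the tabulated N's; the variable names are illustrative assumptions. It treats the last row of Tables 3 and 6 as covering every case whose vocabulary IQ equals or falls below the Wechsler-Bellevue IQ, as the text describes.

```python
# Case counts transcribed from Table 3 (age groups, youngest to oldest).
table3_greater = [14, 47, 61, 38, 33, 15]
table3_equal_or_less = [11, 7, 15, 14, 11, 2]

# Case counts transcribed from Tables 4 and 6 (diagnostic groups).
table4_n = {"schizophrenia": 90, "manic depressive": 18, "luetic": 9,
            "psychoneurosis": 24, "without psychosis": 52}
table6_greater = {"schizophrenia": 78, "manic depressive": 14, "luetic": 8,
                  "psychoneurosis": 20, "without psychosis": 37}
table6_equal_or_less = {"schizophrenia": 12, "manic depressive": 4, "luetic": 1,
                        "psychoneurosis": 4, "without psychosis": 15}

# Total cases classified by age and by diagnosis.
total_by_age = sum(table3_greater) + sum(table3_equal_or_less)
total_by_diagnosis = sum(table6_greater.values()) + sum(table6_equal_or_less.values())

# Each diagnostic group's two counts in Table 6 should add up to its N in Table 4.
groups_agree = all(
    table6_greater[g] + table6_equal_or_less[g] == n for g, n in table4_n.items()
)
```

The age-group counts total 268 cases, and the five diagnostic groups total 193 cases, with each group's two counts in Table 6 adding up to its N in Table 4.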


DISCUSSION AND SUMMARY 


It appears, from the results obtained, that the Binet, Form L 
Vocabulary is a fairly good indicator of the individual’s relative 
position in a group, when the intellectual level is under con- 
sideration. The coefficient of correlation of .78 between the 
vocabulary and the Wechsler-Bellevue IQ’s is close to the 
coefficients reported by Wechsler® and others!® in whose studies 
the full Binet Scale was administered. It is quite clear, however, 
that the vocabulary overestimates the mental level all along the 
line. Whether the comparison of IQ’s is approximate (based on 
the inspection of the mean of the number of words) or more 
accurate (based on actual computation from the corresponding 
MA’s), the results show a difference of ten to fifteen IQ points in 
favor of the vocabulary. This difference is not clearly a function 
of age, though it seems to be smallest in the adolescent-age group 
and largest in the old-age group. 

A speculatory attempt at explanation may be ventured. In 
the first two decades of life, testable intelligence is a comparatively 
leveled aggregation of abilities, while in the following decades, 
due to advancing age and concomitant process of natural deterio- 
ration, some intellectual aspects, like information, vocabulary, 
etc., remain preserved, or develop to higher levels. The very 
large discrepancies in the last decade group of the patient popu- 
lation are probably due, to some extent, to the idiosyncrasies of 
IQ interpretation and computation at the higher age levels on the 
Wechsler-Bellevue scales. It is also interesting to note that in 
the ‘below twenty’ group the number of cases whose vocabulary IQ 
is equal to or less than the Wechsler-Bellevue IQ is almost the 
same as the number with the discrepancy in the other direction. 
However, in the remaining age groups only a small minority of 
the cases show vocabulary IQ’s equal to or smaller than the 
Wechsler-Bellevue IQ’s. Thus, the mean discrepancies are not 
affected by a few large discrepancies; they represent the trend 
in the overwhelming majority of cases. The new Binet vocabulary 
is especially easy and the corresponding mental ages are high; 
consequently the discrepancies are large. Atwell! compared 
the new with the old Binet vocabulary and found an 
average difference of 3.81 years in mental age. A discount of 
thirteen IQ points from the obtained IQ based on the vocabulary 
may give a fairly close estimate of an individual’s mental level 
(Cf. M = 20.1 words and corresponding mean IQ = 89.7). 

The most stable relationship between vocabulary level and 
level of general intelligence exists in the nonpsychotic group 
(not counting the too small luetic group). The two major psy- 
chotic groups show very similar coefficients of correlation, while 
the psychoneurotic group shows the most unstable relationship 
between vocabulary level and general intelligence. The strength 
of the relationship in the psychoneurotics is especially broken 
down by the performance portion of the scale. It may be 
wondered whether the different types of psychoneurosis repre- 
sented in this group may be the cause for greater variation in the 
inter-capacity relationships. The time element, which is of 
utmost importance in the performance part of the Wechsler- 
Bellevue Scale, may be modified in a more variable manner in a 
mixed psychoneurotic group than in a schizophrenic group where 
it may show the even effects of retardation and consequently not 
show up in the correlational results. 

Parenthetically, it may be of interest to remark about the 
obviously high vocabulary and general intellectual level found 
in the manic-depressive group as compared with the others. It 
is consistent with the sociological studies indicating a greater 
incidence of this psychosis in the higher socio-economic groups. 
Finally, the examination of discrepancies between obtained 
vocabulary IQ’s and Wechsler-Bellevue IQ’s only partly confirms 
Babcock’s contentions. The discrepancies in favor of the 
vocabulary tend to be appreciably higher in psychopathological 
conditions than in nonpsychotic individuals. However, one 
fails to distinguish on the basis of our mean discrepancies, 
between the more severe and disabling mental disturbances and 
the milder ones, which supposedly leave the intellect intact and 
well preserved. 

Considering the foregoing paragraphs a few conclusive state- 
ments may be made. 

1) The Revised Stanford Binet vocabulary correlates quite 
highly (r = .78) with the Wechsler-Bellevue test in a heteroge- 
neous group of two hundred sixty-eight psychotic and non- 
psychotic hospital patients. The obtained IQ, based on the 
vocabulary, tends to be, on the average, thirteen points higher 
than the Wechsler-Bellevue IQ. 

2) Fairly high correlations are obtained during the decades of 
life below sixty between vocabulary and general intelligence. 
The relationship breaks down after sixty. 

3) The vocabulary IQ overestimation is smallest in the 
youngest age group (below twenty), keeps fairly constant during 
the next few decades (thirteen to twenty IQ points), and becomes 
highest (25+) in the old-age group. 

4) The stablest relationship between the vocabulary level and 
general intelligence (highest r’s) is seen in the nonpsychotic 
group, while the least stable relationship is present in the psycho- 
neurotic group. The results on the performance part of the 
Wechsler-Bellevue Scale are largely responsible for the com- 
paratively poor correlation. 

5) All of the psychopathological groups show greater IQ 
discrepancies (in favor of the vocabulary) than does the non- 
psychotic group. Little difference in this respect is to be noted 
between the several diagnostic groups. No relationship between 
magnitude of discrepancy and severity of type of mental condition 
has been observed from the present results. 


THE FACTOR OF MOTIVATION IN LEARNING AS 
APPLIED TO THE MAKING OF A TEACHING FILM 


JOHN L. HAMILTON 


University of Minnesota 


It has been stated¹ that motivation is the sine qua non of 
learning. From this it is logical to assume that in the making of 
a teaching film whose admitted purpose is to aid learning, the 
factor of motivation is, or should be, an important consideration. 
That motivation is not sufficiently considered can be readily 
demonstrated by viewing some of the teaching films that are 
used in the classrooms across the nation. This is not meant to 
be an indictment of all teaching films for even those that are lacking 
can be efficiently utilized by the skillful teacher. The point 
to be stressed here is that better films can be made when more of 
the principles of learning are considered in the planning of the 
films. One of these principles is that learning takes place only 
when the subject is motivated. 

Perhaps the failure to consider the factor of motivation lies in 
the newness of the medium of learning through films, but more 
likely it is due to the assumption that the film is so novel in itself 
when presented in the classroom that motivation is automatically 
achieved. This kind of motivation rarely produces satisfactory 
learning. Learning through films will be most efficient when the 
sound projector is as familiar a part of the classroom as the teacher 
or the textbook. Another reason why motivation is dismissed 
as an integral part of the teaching film lies in the tendency to 
leave it up to the teacher. It is true that the teacher can do 
more than the film toward proper motivation, but should not the 
film assist him in this task? It is the purpose of this paper to 
examine seven generalizations of motivation in the light of their 
application to the making of a teaching film. 

To confine this survey to certain limitations: (1) No attempt 
will be made here to give a complete analysis of motivation as it 
applies to producing educational films. (2) The problem of 
utilization of teaching films is considered as apart from the scope 
of this paper. (3) The films listed in a recent survey as being the 
most popular with schools are used as examples wherever 
possible.² (4) Educational films are of three distinct types, 
namely: those intended to teach a skill, those which impart 
information or facts, and those whose purpose is to establish 
attitudes. It is perhaps with the latter two types that the 
producer will be most concerned with motivation, for a man 
learns readily to run a lathe or fire a gun because it will directly 
give him a greater income or save his life. 

¹ Arthur I. Gates, et al. Educational Psychology, p. 311. 

The source of the generalizations concerning motivation is an 
article entitled “Motivation in Learning” by David Ryans.³ 
His article lists ten ‘comments’ on motivation. It was found 
that three of these relating to reward and praise, punishment, 
and the factor of the teacher did not apply to the present survey. 
The others provide the divisions for this paper. 


I 


Emphasis on meanings and relationships contribute to the individual’s 
set for learning. 


There are several implications in this statement by Ryans 
that are worth consideration by the producer who is seriously 
interested in making good, usable teaching films. These involve 
the complexity of the picture itself, the sound commentary, the 
camera angles, and lighting of the scenes photographed. 

One of the functions of the teaching film is to give meaning to 
new material. This meaning is conveyed through the media of 
sound and pictures. The sound may be either a part of the scene 
shown or it may be the words of a commentator who explains the 
scenes on the screen. Sound and pictures must obviously 
supplement each other in conveying meanings; they cannot be 
opposed. Some unexplained activity in the pictures can be as 
confusing to the learner as having the commentary mention 
things not shown. 

For the learner to be motivated it is especially necessary for the 
beginning of a film to be meaningful. Scenes technically bad at 
the beginning of a film may easily contribute to confusing meanings 
and consequently interfere with efficient motivation. 

² This survey made by the Bureau of Visual Aids at Indiana University 
showed that the following five films were the most frequently booked films 
in the library: Old Glory, Story of Dr. Carver, Elephants, Heart and Circulation, 
Adventures of Bunny Rabbit. It is interesting to note that three of these 
films are definitely intended for the lower-grade levels. 

³ The Forty-first Yearbook, National Society for the Study of Education, 
Part II, pp. 289-331. 

The opening scenes of the Erpi film Heart and Circulation are 
an interesting combination of sound and pictures that should be 
cited as an example of a beginning that makes use of this principle 
of motivation. Before the pictures come on the screen, one hears 
the beat of a human heart considerably amplified. The first 
pictures show a stethoscope that is picking up this heart beat as 
it is held over the heart of a young man. The commentary 
begins, “Day and night there operates within the body a marvelous 
machine, the human heart.” This is a good opening for it 
starts with that part of the functioning of the heart which is most 
familiar to us. It is a meaningful beginning that undoubtedly 
contributes to the success of this film. From this point it is very 
easy to show through animation how the heart functions starting 
with the valve action that gives sound to the heart’s action. 

Frequently the popular magazines, Life, etc., run a series of 
photographs that were taken at high magnification of familiar 
objects. Usually these familiar objects take on an entirely 
different appearance, and, consequently, different meaning when 
viewed with the camera’s magnifying eye. This is only one 
instance of how the camera can change the meaning of a familiar 
object. Both the angle at which the camera is set and the 
lighting utilized contribute meanings. These meanings may help 
motivation or they may hinder motivation. 

The interrelationship of the various ideas expressed in a film or 
the relationship of the film material with other material contributes 
to learning set and, therefore, should be important to the film 
producer. The position of each idea conveyed in the film in its 
relationship with the whole unit of knowledge should be made 
clear by any number of devices open to the technician. One of 
these devices is animation. Through it the segment of a large 
map that is to be studied can be clearly pointed out at the intro- 
duction. Portions of a plant, animal, or inanimate object can 
thus be outlined. The technician can make frequent use of 
close-up pictures that assist in conveying meaning. The 
question of whether the close-up or the distant view should come 
first cannot be settled here, but suffice to say that both are 
necessary in conveying meanings and interrelationships and 
should be applied liberally with discretion as an aid to motivation. 
Many films tend to convey the idea that everything to be said 
on a particular subject is contained in this particular film. This 
is unfortunate from the standpoint of motivation if we can 
generalize from Ryans’ statement that a limitation of ideas is 
better than attempting to cover a broad area. It is better to 
cover a few items well than many carelessly. 


II 


Interests, attitudes, and purposes must sometimes be developed, or 
needs created, as a first step in learning. 


According to this principle it would be unwise for the maker of 
fact films to assume that because there is a film made on the 
subject it is universally interesting. Some devices must be 
found whereby the material to be taught is associated with 
attitudes that are known, and known to be favorable. It should 
also be a well-established principle that the primary purpose of 
many teaching films is to establish attitudes. Those that serve 
the latter purpose cannot be expected to be fact films as well. 
Such a film as this is The River. The fact film must motivate 
within itself, while the attitude film motivates for learning 
outside the film. 

In the film entitled Know for Sure, which deals with venereal 
disease, the introduction is made dramatic and occupies about 
five minutes of the whole film, which is twenty minutes in 
length. The sole purpose of this introduction is to point out the 
purpose of the film. It is certainly the first step in learning for the majority 
for whom the film was intended. Another film, Prelude to War, 
whose primary purpose is to teach a point of view and incidentally 
point out the events that led to the present war, opens not with 
the attack on Manchuria in 1931, but with the bombing of 
Pearl Harbor. The events leading up to this point are thus 
made more real since the picture begins with a subject on which 
the viewer has a definite attitude. The problem becomes a more 
difficult one when one considers, for example, the teaching of the 
geography of Canada. Through a quick series of pictures the 
film can show how Canada affects the life of Americans. If the 
film is to be about a particular section of the country, the relations 
can be even more specific.⁴ 

⁴ A skill film may sometimes require an opening section devoted to 
establishing an attitude toward the material to be learned. Such a film is 
Jap Zero where the reason for careful identification of this particular 
plane is not at first apparent. 


III 


Goals and standards to be met function successfully as incentives only 
when adapted to pupil ability. 


In order to sell as many prints of a film as possible, film producers 
are prone to give the impression that the films they make 
can be used at any grade level. This is particularly true of Erpi 
films. For example, the Erpi catalogue of films on physical 
sciences mentions the grade level of these films in only one place 
as follows: “The following films constitute an excellent survey 
of the physical sciences for use in high schools, junior colleges, 
teacher’s colleges, liberal arts colleges, universities, technical 
schools, and adult education.” 

Growing out of this idea of universal grade level for films is the 
matter of the vocabulary used in the commentary, the complexity 
of the pictures as well as the scope of the material to be 
covered. In the case of the latter, it has already been mentioned 
that the pupil is best motivated when he has a limited area of 
subject-matter to cover rather than a broad indefinite mass of 
facts to assimilate. Motivation is easiest also when the question 
of just what the pupil is to learn is made clear to him at the outset. 
Very few films make the frank statement of what should be 
learned. This statement, if it is made at all, is saved until the 
end and forms a conclusion. 

A few examples of how some films have either succeeded or 
failed in this respect are in order here. One of the Erpi films 
obviously intended for the lower-grade levels is The Adventures of 
Bunny Rabbit. This film opens with a picture of a hole in the 
ground. The commentator explains that this is a rabbit’s home. 
The next picture then shows a furry mass which is again explained 
by the commentator to be the young rabbits all covered up with 
fur. How much better the opening of this film could have been 
from the standpoint of motivation had the first pictures been of 
the mother rabbit entering the hole. The second picture 
obviously requires the explanation by the commentator, for how 
many young children could appraise the complexity of this 
picture since very few young people would know that the mother 
rabbit pulls out some of her own fur to cover the young?⁵ Certainly 
motivation to learn could be heightened here by some 
consideration of the complexity of the situation. 

In contrast with this is the Erpi film entitled Elephants which 
is also intended for the lower-grade levels. This film opens with 
the very simple picture of the head of an elephant. The commentary 
begins: “Isn’t this a strange face?” Such a beginning 
for a teaching film is better than one which begins with a complex 
scene that needs explanation by the commentator. This film 
also utilizes events that would be understandable to young 
people. In order to demonstrate the strength of the elephant it 
is shown pushing a truck out of a ditch. 


IV 


Definite objectives are necessary if motivation is to be effective. 


This generalization by Ryans includes one important impli- 
cation for the maker of teaching films. This implication is that 
the beginning of the film should include some approximation of 
what the student is to learn from the film. The new Erpi 
films on Canada furnish examples where, through animated 
maps, the area under consideration by the film is pointed out at 
the beginning. These maps show the whole of Canada and then 
the particular area to be studied is quickly outlined in white. 
The camera then moves up to a closer view and the film immedi- 
ately changes to scenes dealing with the first item to be con- 
sidered. The student thus gets an approximation of the material 
to be learned and can, therefore, establish some kind of objective. 
Obviously, the only method of establishing objectives in the film 
is through the use of pictures, sound, and words, but in the 
absence of the first two, words alone can assist the student in 
orienting himself. Certainly the words at the beginning of a 
film such as “The purpose of this film is to . . .” can hardly be 
out of order in a teaching film. The most flagrant violators of 
this principle, if it is one, are the makers of documentary films 
frequently used in the social science area. One such film entitled 
Valley Town furnishes an excellent example. This is a good film 
and it is certainly better than none at all, but its purpose is 
confusing, since this is not actually stated until near the end. 





⁵ This fact is never mentioned. 

The fact that the statement of the objective is left out of the 
beginning of Valley Town is perhaps heightened by the fact that 
the film deals with social conditions, unemployment, unrest, 
etc., due to technological advancement. In times of manpower 
shortage such a purpose is not quickly realized. Undoubtedly, 
such a procedure in this particular film gives it a greater dramatic 
quality and the question might well arise as to whether attitude- 
producing films should not stake all on a higher dramatic quality. 
Cinematic devices such as animated diagrams, split-frame 
pictures, and dissolves are at the disposal of the producer who 
wishes to avoid the strictly pedagogical approach of stating the 
objective bluntly. 


V 
Pupil interests are important sources of motivation. 


There is a controversy at present in the educational film world 
around the problem of how much entertainment can be placed in 
a teaching film. The work of Walt Disney has perhaps brought 
this matter to the foreground. Some have argued that enter- 
tainment does not have a place in a teaching film because it 
distracts. Others have taken the stand that there is a place for 
some entertainment. 

The above generalization would imply that at least in the case 
of young children motivation would be increased if the teaching 
were presented in an interesting if not entertaining fashion. 
Certainly this was the case in two films previously mentioned, 
Adventures of Bunny Rabbit and Elephants. In the case of the 
latter film it is interesting to note that the story of the elephant 
is told through pictures of a young boy learning to train a pair 
of elephants. Obviously, the young people who view the film 
are going to be interested in the fact that this young man is of 
their own age and will, therefore, look upon the activities of 
elephant training with keen brotherly interest. 

Animated cartoons are a part of every theatre audience’s 
experience. They are a part of the American way of life. As 
such it is not unusual to see them applied to the field of teaching. 
It is interesting to note that one of the most popular educational 
films of today is an animated cartoon utilizing a character known 
to all young cinema goers, Porky Pig. The story of this film, 
Old Glory, tells how Porky learns what the pledge of allegiance to 
the flag means. As the film opens he is having difficulty memo- 
rizing the pledge and finally throws the book aside and goes to 
sleep. He dreams that Uncle Sam appears. Uncle Sam then 
tells him, through pictures, some facts about the founding of our 
country and what the flag stands for. This method of teaching, 
showing an old friend being put through the paces of learning, is 
apparently a good type of motivation because it involves pupil 
interests. 


VI 


Specific suggestions and directions for learning contribute to the 
student’s set. 


Specific suggestions are not found very often in any of the three 
types of films listed previously. If these suggestions are found 
in teaching films, such films are usually of the skill type rather 
than the others. Seldom does one find the words, “Now this is 
important” or “This should be remembered.” For some reason 
—perhaps the influence of Hollywood—teaching films have left 
out such simple ways of giving directions for learning. 

There is one set of films on first aid produced by Bell and Howell 
that demonstrates forcefully the usual procedure of enumerating 
the points to be remembered at the end of the film. This 
enumeration is merely a reading of the printed title on the screen. 
It is not accompanied by pictures relating to the five points. 
This reading of the important points is dull and uninteresting. It 
would have been far better to give the important points at the 
beginning, as well as the end, and in such a manner that the 
learner would be motivated to look for examples of each step in 
the body of the film. 

The implication for film producers in this generalization is that 
if functional teaching films are to be made, the traditions of the 
feature film must be abandoned. The school textbook cannot 
be expected to resemble an entertaining novel, for each serves a 
different purpose. The possibilities in animation and other 
cinematic devices cannot be overlooked. A further implication 
is that all teaching films need printed teaching guides for the 
teacher and printed learning guides for the pupil. 
VII 


The motivation of learning for a given individual may be difficult if an 
attempt is made to limit the motive or drive to some specific task, set 
of materials or school subject. 


So often teaching films attempt to give the impression that 
theirs is the final word on the subject at hand. This is a poor 
practice from the standpoint of the psychology of motivation. 
Classroom learning takes place in a broader field than one film or 
one textbook, and it would be wise of the film producer to recognize 
at the outset that the film as a medium has limitations as a 
teaching device. 


SUMMARY 


In applying the principles of motivation to the making of 
teaching films the following cautions should be observed: 

1) The complexity of pictures may influence motivation. 

2) Pictures and sound must convey the same meaning. 

3) Unusual lighting and camera angles may have a negative 
effect. 

4) Animation should be used to assist in making clear the 
interrelationships of ideas. 

5) Universal interest in the subject-matter of a film should 
never be assumed. 

6) A whole film may be devoted to the single purpose of 
establishing an attitude. 

7) A teaching film should be made for one grade level only. 

8) The objectives of a teaching film should be stated clearly 
at the beginning. 

9) Young people in films for young people aid motivation. 

10) Familiar characters from the movie world used in teaching 
films may prove to be an efficient device for motivation. 

11) Specific suggestions for learning both in the film and 
through pupil-teacher guides need to be considered. 

12) Limitations of the film medium must be recognized.


Of the various methods suggested for studying concept formation,
those developed by Hull and Kuo required the subject to
abstract common elements from Chinese symbols. Smoke
used a somewhat more complex criterion, maintaining that the
indispensable character of concept formation is a response to
relationships common to two or more stimulus patterns, rather
than simply the abstraction of common elements.

For his study, Smoke prepared a set of ‘nonsense concepts.’ 
For example, a DAX was defined as “a circle with two dots, one
inside and the other outside the circle.” All examples of a DAX
had these features in common, but not others, as, for example, 
the size of the circle, the thickness of the circumference, the size 
of the dots, and the position of the dots (within the limits of the 
definition). The term ‘positive instance’ was used to refer to 
examples of a given concept. A ‘negative instance’ was an 
example which was similar to but not an exact representation of 
the concept. Each ‘negative instance’ violated one, and only 
one, of the conditions essential to the concept. The subjects 
were required to learn the definition of each concept from a 
serial presentation of a number of examples. Learning was 
appraised by asking a subject: (a) to define the concept verbally; 
(b) to draw two examples; (c) to indicate which of sixteen designs 
in a test series were examples of the concept. In the present 
study, only the last method of evaluating learning was used. 

In one experiment Smoke presented alternately ‘positive’ and 
‘negative’ instances to a limited number of college students who 
acted as subjects. Smoke concluded that there was no statisti- 
cally significant evidence to show that concept learning is either 
more or less effective when the series contains both ‘positive’ and
‘negative’ instances than when it contains only the former.


THE PROBLEM


For the present study, the materials suggested by Smoke were 
modified so that the learning cards and test series were suitable 
for group use with Grade II pupils. Four experimental situations 
were devised to determine under which conditions Grade II 
pupils are able most effectively to generalize the materials used in
the investigation.


THE SUBJECTS 


Four experimental groups, each of forty Grade II pupils, were
selected by matching the subjects individually for sex, and for IQ
obtained from the Otis Group Intelligence Scale, Primary Examination,
Form A.¹ The groups were also comparable in CA, no
difference between means, or sigmas, exceeding 1.2σ_D. The boys
and girls within each experimental group were also comparable in
IQ and CA. The data are shown in Table I, along with total
generalizing scores.


MATERIALS AND PROCEDURES 


Definitions of nine concepts were selected, one to illustrate the 
problem, and eight for experimental purposes. Four ‘positive’ 
and two ‘negative’ instances, each on a 9” x 12” sheet of manila 
tag, constituted the learning material. The ‘positive’ instances
were labelled P₁, P₂, P₃, P₄, respectively, and the others N₁ and
N₂, on the reverse side of the cards. For the guidance of the
subjects, the experimenter printed on each P card the name of 
the concept, for example, DAX. The words “This is not a 
(DAX)” were printed on each N card. 

Finally, a test series of ten designs (items) was prepared for 
each concept, from four to six items (determined by chance) 
being examples of the concept, and the remainder ‘negative’ 
instances. The cards of the test series (or test) were shown in a 
fixed order after the presentation of each learning card. A set of 
instructions was prepared in order to ensure similarity of expla- 
nations in the various classes studying by each method. 

¹ The IQ's were made available through the cooperation of Mr. R. Straight
of the Bureau of Measurements, Vancouver School Board.

Four experimental procedures were used:

1) Method I, in Group I. Cards P₁, P₂, P₃, P₄ were shown in
order, each card being exposed for five seconds, and removed
from sight before the test cards were shown.

2) Method II, in Group II. The procedure was the same as in 
Method I, using learning cards P₁, N₁, P₂, N₂, in order.

3) Method III, in Group III. The same learning cards were 
used as in Method I, but they were left exposed cumulatively 
during the presentation of the test series. Thus the learning 
situation was: P₁ (5 sec.), Test; P₁ + P₂* (5 sec.), Test; P₁ + P₂
+ P₃ (5 sec.), Test; P₁ + P₂ + P₃ + P₄ (5 sec.), Test.

4) Method IV, in Group IV. The same cards as in Method 
II were exposed in the same manner as in Method III. 

The procedure for Method I is given in somewhat greater detail: 

1) Learning card P₁ was exposed for five seconds, and then
removed from sight.

2) The cards of the test series were shown in the predetermined
order, each being exposed for five seconds. The pupil decided
whether each card was or was not an example of the concept
shown in P₁. Having made his decision, he circled the ‘yes’ or
the ‘no’ at the appropriate place in the answer booklet which had
nine pages, each page having four sets of ten items each:


                         Concept 1
   Test 1         Test 2         Test 3         Test 4
 1. yes no      1. yes no      1. yes no      1. yes no
 2. yes no      2. yes no      2. yes no      2. yes no
 3. yes no      3. yes no      3. yes no      3. yes no
  . . .          . . .          . . .          . . .
10. yes no     10. yes no     10. yes no     10. yes no


3) A similar procedure was used for P₂, P₃, P₄. Test 1 refers
to the test series shown after P₁, and so on. The same test cards
were used in the same order for each test of a given concept.

The procedure was the same in Method II, except that the
second and fourth learning cards were ‘negative’ instances. For
Methods III and IV the learning cards were not removed from
sight once they had been exposed. They were placed on a frame
and left there until all the cards had been shown and the four test
series administered. By this method memory was eliminated as
a factor influencing concept formation in Methods III and IV.

* The plus sign indicates that the learning cards were exposed simultaneously
for the five-second period and were also in view during the test.

Each correct response was given a score of one. For each 
concept, then, a subject might have a score varying from 0 to 40 
since there were four tests each of ten items. The ‘total generalizing
score’ could vary from 0 to 320, since there were eight
concepts. 


RESULTS 


Before attempting to answer the main question it was thought 
desirable to analyze the data for sex differences in generalizing 
ability. Total generalizing scores for each sex and for the total 
group for each method are shown in Table I. 


TABLE I.—MEANS AND STANDARD DEVIATIONS OF CA, IQ AND
TOTAL GENERALIZATION SCORE

Method   Sex          CA               IQ              Total
                   M      σ        M      σ        M       σ
I        M.      95.9   5.95      121   7.01      222    42.10
         F.      96.5   5.22      121   9.91      241    45.44
         Both    96.2   5.57      121   8.58      231    44.81
II       M.      94.5   5.20      122   9.32      241    37.20
         F.      95.6   5.05      120   6.71      221    27.09
         Both    95.0   5.15      121   8.15      231    34.05
III      M.      95.6   5.79      119   7.53      214    36.51
         F.      95.3   5.36      121   7.66      218    36.48
         Both    95.5   5.58      120   7.66      216    35.31
IV       M.      95.7   5.40      121   7.14      203    35.34
         F.      97.2   5.27      118   7.71      203    35.15
         Both    96.5   5.39      120   7.51      203    35.25


The average generalizing scores of boys and girls by Methods
III and IV were practically identical. The girls in Group I had
higher average scores than did the boys, the difference, however,
being only 1.37σ_D. The boys scored higher than the girls by
Method II, the difference being 1.94σ_D.* The hypothesis is
suggested that the ‘negative’ instance is of more value to boys 
than to girls, but this must be checked by further experimentation 
with larger groups. 
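
The two critical ratios for the sex differences can be checked against Table I with the short (uncorrelated) formula for the standard error of a difference. The sketch below is an illustration only: the function name `critical_ratio` is hypothetical, and the figure of twenty boys and twenty girls in each group of forty is an assumption, since the article does not give the split.

```python
import math


def critical_ratio(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Difference between two means divided by its standard error.

    Short formula for independent groups:
        sigma_D = sqrt(sd_a**2 / n_a + sd_b**2 / n_b)
    """
    sigma_d = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return abs(mean_a - mean_b) / sigma_d


# Total generalizing scores from Table I; n = 20 per sex is an assumption.
cr_method_1 = critical_ratio(241, 45.44, 20, 222, 42.10, 20)  # girls vs. boys
cr_method_2 = critical_ratio(241, 37.20, 20, 221, 27.09, 20)  # boys vs. girls

print(round(cr_method_1, 2), round(cr_method_2, 2))
```

Under these assumptions the sketch gives ratios of about 1.37 and 1.94, which agree with the values reported in the text.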

Since sex differences were not statistically significant it was 
decided to treat the two sexes as members of a single group for 
the ability under investigation. This method of treating the 
data may lead to inconclusive results should there be a real sex 
difference. It frequently happens that different methods of 
analysis lead to different conclusions. For this reason the data 
of the present problem were analyzed by several methods in 
order that comparisons could be made between the results of the 
various analyses. 


TABLE II.—MODIFICATIONS OF TEST RESPONSES

                                              Method
                                  I            II           III           IV
                              No.  Per      No.  Per      No.  Per      No.  Per
                                   Cent          Cent          Cent          Cent
Perfect score on Test 1        30  .09       30  .09       25  .08       27  .08
Imperfect scores on Test 1    290  .91      290  .91      295  .92      293  .92
Perfected on later tests       32  .10       26  .08       22  .07       16  .05
Improved on later tests       108  .34       97  .30       92  .29       61  .19
Lowered on later tests         73  .23       68  .21       72  .22      115  .36


* Throughout this paper the short formula for the standard error of a
difference was used. Since the members of each pair were matched for IQ
and CA the use of the longer formula might have increased the sizes of the
critical ratios. Considering the limited number of cases, the more refined
technique did not seem to be appropriate.

The first method involved a study of the changes in the scores
obtained on successive tests. A subject was credited with ‘a
perfect score’ each time he responded correctly to all ten items
of a test series. If on any test subsequent to the first he responded
correctly to all items he was said ‘to have perfected his score on
a later test.’ If the subject obtained a higher score on tests 2, 3
or 4 than on test 1 he was credited with ‘improving his score.’
Each time his score was lower on later than on the initial test his
‘score was lowered on later tests.’ The data are shown in
Table II.
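
The four labels just defined can be expressed as a small rule applied to one pupil's four scores on a single concept. This is only a sketch of one reading of the definitions: the function name `classify_case` is hypothetical, treating each pupil-concept pair as one case (40 pupils × 8 concepts = 320 cases per method) is an assumption, and so is the restriction of ‘perfected’ to cases that were imperfect on Test 1.

```python
def classify_case(scores, max_score=10):
    """Label one pupil's four test scores on a single concept.

    `scores` is [test1, test2, test3, test4], each from 0 to max_score.
    The labels are not mutually exclusive: a pupil may both improve
    and fall below the initial score on different later tests.
    """
    first, later = scores[0], scores[1:]
    return {
        "perfect_on_test_1": first == max_score,
        # Assumption: only cases imperfect on Test 1 can be 'perfected'.
        "perfected_later": first < max_score and max_score in later,
        "improved_later": any(s > first for s in later),
        "lowered_later": any(s < first for s in later),
    }


example = classify_case([7, 8, 10, 6])
print(example)
```

Counting the cases that receive each label, method by method, would yield a table of the same form as Table II.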

Some evidence on the original comparability of the groups may 
be obtained from the fact that the total numbers of perfect 
scores on test 1 were almost identical for each group. The 
results of this method of analysis do not give statistically signifi- 
cant differences between methods. There is, however, the 
suggestion that Method IV is least suitable. 

This conclusion is consistent with that obtained by a second 
type of analysis. It was found that eighty, eighty-one, seventy- 
eight and sixty-one per cent of the pupils in Groups I, II, III 
and IV, respectively, responded correctly to more than sixty per 
cent of the items on Test 4. None of the differences in per- 
centages is significant. 

A third method of treating the data led to similar and more
conclusive results. The highest score a student could obtain on 
each test is 10, or a maximum score of 80 for all eight concepts. 
Table III shows the average score for each group on each test for 
all concepts. 

TABLE III.—MEAN AND SD OF EACH TEST FOR EACH METHOD

                               Method
             I             II            III            IV
Test      AM    SD      AM    SD      AM    SD      AM    SD
1         56   10.2     57    7.6     54    7.4     54    8.5
2         58   11.5     56    8.8     52    9.5     47   11.0
3         58   12.6     59    9.3     52   10.1     53    9.6
4         58   12.4     59   10.0     55    9.9     49   11.8


There is very little difference between mean scores on Test 4
of Methods I, II and III. However, the average score of Test 4,
Method IV, is lower than the corresponding mean scores of the
other methods, the differences being 3.32, 4.16 and 2.46 times the
respective standard errors.

It is of interest to note the changes in the values of the standard 
deviations for successive test series, additional experience tending 
to produce greater variability within the group. None of the 
differences between standard deviations is significant. 

The next analysis involved the determination for each method 
of the number of tests on which each subject made no error. 
Since there were eight concepts each involving four tests, a 
student could have from none to thirty-two errorless trials. 
The data for this analysis are shown in Table IV. 


TABLE IV.—NUMBER OF ERRORLESS TESTS


Method


The significance of the difference between means shown in
Table IV is indicated in Table V.


TABLE V.—SIGNIFICANCE OF DIFFERENCES IN NUMBER OF
ERRORLESS TESTS

                        Chances in 100
Method        CR        of a real difference
I-II         2.94             99.9
I-III        2.96             99.9
I-IV         5.05            100.0
II-III        .26             60.0
II-IV        1.85             96.5
III-IV       1.55             93.5

Using the customary criterion of significance only one difference 
is statistically significant. However, pending further infor- 
mation it may be said that Method I is superior to Methods II, 
III, and IV and that Method IV leads to the poorest results.

A final analysis was made by comparing average scores on all
four tests of all concepts. Each student’s score could vary from 
0 to 320. The data are shown in Table VI. 


TABLE VI.—TOTAL SCORES BY METHODS

                       Method
             I        II       III       IV
Mean        231      231       216      203
SD         44.81    34.05     35.31    35.25


Again there is the suggestion that Method IV leads to inferior
results. The significance of the various differences is shown in
Table VII.


TABLE VII.—RELIABILITY OF DIFFERENCES BETWEEN MEAN
TOTAL SCORES

Difference     D ÷ σ_D
I-II              .00
I-III            1.66
I-IV             3.10
II-III           1.93
II-IV            3.60
III-IV           1.65


Only two of the differences are significant, but it does appear 
that Methods I and II are superior to III and IV. This would 
mean that subjects generalized better when they had to remember 
past experiences than when they simply compared test cards 
with stimuli present to the visual sense. 


RELIABILITY AND VALIDITY 


(a) Reliability (or possibly preferably ‘internal consistency’) 
was estimated by the split-half method, correlating Test 4 scores 
on the odd-numbered with those on the even-numbered concepts. 
The result was stepped-up by the Spearman-Brown Prophecy 
Formula, giving the reliability of the Test 4 scores for all eight 
concepts. The coefficients are shown in Table VIII.


TABLE VIII.—RELIABILITY COEFFICIENTS

Method     Reliability
I              .92
II             .92
III            .92
IV             .87

These results are unusually high for a test of this type. For 
Method I, the predicted reliability coefficient of the total general- 
izing scores on all four tests of the eight concepts would be .98. 
Depending upon the interpretation placed upon this type of 
coefficient it may be concluded either that the test is very 
reliable, or that its internal consistency is extremely high. 
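
The Spearman-Brown step can be illustrated briefly. The sketch below assumes that the predicted coefficient of .98 was obtained by treating the four tests as a fourfold lengthening of Test 4; the article does not state the lengthening factor. The half-test correlation of .85 is a hypothetical value chosen only to show the step-up from halves to the full test.

```python
def spearman_brown(r, k):
    """Predicted reliability when a test of reliability r is lengthened k times."""
    return k * r / (1.0 + (k - 1.0) * r)


# Stepping up a split-half correlation to full length uses k = 2.
# The value 0.85 is hypothetical, not taken from the article.
full_from_half = spearman_brown(0.85, 2)

# Method I: Test 4 reliability .92 (Table VIII), extended to four tests.
four_tests = spearman_brown(0.92, 4)

print(round(full_from_half, 2), round(four_tests, 2))
```

With k = 4 the formula gives about .98 for Method I, in agreement with the figure quoted above.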

(b) At the present time any statement of validity depends 
upon an analysis of the psychological processes involved, but it 
appears that the test does evaluate ability to make generali- 
zations. Test scores were correlated with MA and reading age. 
Correlations with the former varied from .02 to .22 for the 
different methods, and with the latter from .04 to .39, only the 
last one being significant. 


CONCLUSIONS 


The following conclusions must be regarded as tentative and 
suggestive because of the presence of many differences which were 
not statistically significant: 

1) There is a suggestion of a sex difference in ability to make 
use of ‘negative’ instances, the girls excelling the boys in Method 
I, which employed only ‘positive’ instances, whereas the boys 
made higher scores than the girls when ‘negative’ instances were 
introduced. 

2) In general, the pupils at this level obtain higher generalizing 
scores when they are required to recall past experiences than 
when the memory factor is eliminated. “It is the intense 
effort which educates.” 

3) At the Grade II level, the ability investigated in this 
study correlates low, positively and not significantly with mental 
age and reading age. 

4) Any differences between methods may be more closely 
related to the nature of the instructions than to the differences in 
procedure. This possibility requires further investigation.


BOOK REVIEWS


Harry E. Burroughs. Boys in Men’s Shoes. New York: The
Macmillan Co., 1944, pp. 370.


This book is essentially a description of the beginning, the 
development, the aims, and the achievements of the Burroughs
Newsboys Foundation and of its summer center—Agassiz 
Village. Psychologists will be interested for the most part in the 
counseling program involved. The influence of Burroughs in 
establishing and guiding the counseling in the Foundation can- 
not be overemphasized. He was almost wholly responsible for 
organizing procedures and for choosing the counselors. Dis- 
cussions are organized about illustrative case studies. These, 
however, are too sketchy to yield more than a general notion of 
procedures and results. 

Burroughs is convinced that counselors are born rather than 
produced by training. Consequently he is intolerant of scientific 
procedure and academically trained workers, probably due to 
unfortunate experience with workers who had had inadequate 
training in counseling procedures. To a certain degree he is right.
But he fails to realize that counselors who are born that way 
might become more effective in their work if properly trained. 

Ways of getting things done in the Foundation are illustrated 
by the following: (1) Emphasis is placed upon a trial-and-error 
method in influencing youth. (2) Appeal should be made to a 
great variety of interests and opportunities to stimulate the boy 
in question. In addition to bringing opportunities to boys, 
boys should be brought to opportunities. (3) Approval by 
adults of achievement, recognition of desired behavior and 
reward are desirable. (4) A counselor must have close personal 
relations with a boy to influence his behavior. (5) Guidance 
should be given to the ordinary or normal boy as well as to 
problem cases. (6) Personal contact with distinguished indi- 
viduals can stimulate emulation and become a potent force for 
good. (7) Emotional reinforcement should be employed in the 
educative process. (8) Discrimination by the counselor must be 
exercised in making the first contact in times of crisis in a boy’s 
career. 

Important aspects of the counseling involve interviews, staff
conferences, followed by trial-and-error procedures. Emphasis is
placed upon shaping attitudes of the boys and giving them
worth-while values. Inspirational discourse has an important 
place here. Final responsibility in a difficult case is taken by 
Burroughs rather than by a counselor. 

Counselors of boys, particularly of newsboys, will find much of 
value in this book. The material is interesting, and, for the most 
part the methods and procedures appear to be psychologically 
sound. It is, however, pretty much a treatise on achievements 
rather than one on methods. The system is so personalized in 
Burroughs that most counselors will have difficulty in gaining 
much from the material in the book. Training under Bur- 
roughs in his Foundation, however, should be very profitable. 

There is no question but that Burroughs is doing an excellent 
job in his Foundation. It is to be hoped that the good work 
may continue after he relinquishes the leadership. 

MILES A. TINKER


University of Minnesota 


Noel P. Gist, C. T. Pihlblad, and Cecil L. Gregory. Selective
Factors in Migration and Occupation: A Study of Social
Selection in Rural Missouri. Columbia, Mo.: University of
Missouri, 1943, pp. 166. $1.50


When on the one hand one hears of farmers drowning chicks 
to maintain the current price level of eggs, the while our own poor 
and the world’s needy go without; and when on the other hand 
one is told that the farmer’s son leaves the farm and goes to 
work in the city so that he can become rich enough to have a 
home in the country, the question of whether the farmer is 
typified by substantial if at times misguided intelligence or by
mental inferiority is immediately raised. It is another strain on 
one’s imagination to evaluate a study of this question by three 
men from the city, of all places! However, there is no question 
but that Gist, Pihlblad, and Gregory in their report entitled 
Selective Factors in Migration and Occupation have gone to great 
pains to learn objectively of the truth about rural Missouri. 

In 1938, 5,461 people (2422 males and 3039 females) who had 
been high-school students one or more years in ninety-seven 
rural Missouri communities between 1920 and 1930 were inter- 
viewed by field workers for data on: (a) the kind of home from 
which they had come, (b) paternal occupation, (c) present
occupation, and (d) present place of residence. Most of the
individuals had lived within a seventy-five mile radius of Colum- 
bia, Missouri. School executives gathered data on the scholastic 
ratings of these individuals while in school, each rating being 
determined by dividing the grade-point average of the student 
by the grade-point average of the school attended. These 
ratings on school achievement were used as indirect measures of 
intelligence. Ratings below 85 were considered inferior, 85-114 
average, and above 114 superior. 
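
The scholastic rating described here amounts to a simple ratio. In the sketch below the multiplication by 100 is an assumption inferred from the cut-off values of 85 and 114, the function names are hypothetical, and the grade-point figures in the example are invented for illustration.

```python
def scholastic_index(student_gpa, school_gpa):
    """Student grade-point average relative to the school average (100 = parity)."""
    return 100.0 * student_gpa / school_gpa


def classify_index(index):
    """Bands quoted in the review: below 85, 85 to 114, above 114."""
    if index < 85:
        return "inferior"
    if index <= 114:
        return "average"
    return "superior"


# Invented example: a student averaging 3.0 in a school averaging 2.5.
rating = scholastic_index(3.0, 2.5)
print(rating, classify_index(rating))
```

Because the index is relative to each school's own average, a given rating need not mean the same thing in a lenient school as in a strict one.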

Places of original residence, rural as opposed to city, were 
classified as farm, small village (under 1000) and large village 
(1000-2500). ‘Present’ residence included these in addition: 
Class I, cities between 2500 and 9,999, Class II, cities between 
10,000 and 49,999, and Class III, cities with 50,000 and over in
population. 

Paternal occupation and ‘present’ occupation were classified 
as (1) teachers, (2) other professional, (3) clerical workers, (4) 
business [salesmen, proprietors, managers], (5) skilled workers and 
foremen, (6) semi-skilled and unskilled workers, (7) farmers, (8) 
housewives, (9) housekeepers, and (10) unemployed and 
unclassifiable. 

‘Present’ places of residence were designated as in seven 
zones: (1) same address, (2) same county, (3) adjoining county, 
(4) other Missouri counties, (5) states adjoining Missouri, (6) 
other states, and (7) foreign countries. 

The study itself was divided into six areas of investigation, the 
first of which was the relation of selective migration to the type 
of community. A statistically significant difference appeared 
between male and female scholastic attainment, a difference
accounted for by the authors as evidence of social pressure upon
women to conform, the sports interests of the boys, and taboos 
on male scholastic attainment. The reader is apt to wonder 
whether some of the discrepancy between the numbers 2422 
males and 3039 females is an index of early male departure from 
school to help at home or to get a good job to support the family; 
also to conjecture that excessive absence to help with the farm 
chores or excessive time consumed in helping would account for 
some of the difference in scholastic attainment. 

Females were found to be more migratory than the males, 
although the males seemed present in greater proportions in the 
larger cities. Larger cities tended to attract people of high
scholastic index in disproportionate numbers, thus indicating a
drain on the rural communities of people of that type. 

Sixty per cent of the group studied had moved away from their 
original residence; 20.17 per cent to other counties, and 11.8 per 
cent to other states and foreign countries. Women, though more
migratory than men, tended to go shorter distances. The 
authors felt that State requirements for teacher certification 
kept many women within the State. There was some evidence 
that non-migrants were, in greater proportion, of inferior 
scholastic index. 

About one-third of the males and females of farm and village 
origin took up adult life in the same home. About half the farm- 
bred males either did this or moved only short distances. Vil- 
lage-bred males and females made longer moves, perhaps because 
of better orientation in vocational opportunities. 

Occupational data on 2142 males and 2773 females showed 
two-thirds of the females to be housewives, and the rest mostly 
teachers, clerical workers and housekeepers; one-fourth of the 
males in business, one-fifth farmers, and the rest evenly dis- 
tributed, with clerical workers the least represented. The 
scholastic index for male teachers was proportionately greater 
than for women teachers and exceeded the male averages for 
other occupational groups. The male average in other pro- 
fessions exceeded that in clerical work, and successively lower, 
in order, were business, farmers, unskilled and skilled workers 
group averages. The scholastic index for women teachers led 
the other occupational indices, which were successively as 
follows: clerical, other professional, housewives, skilled, business, 
housekeepers, and unskilled. 

Sons and daughters of professional men and of teachers aver- 
aged a higher percentage of superior scholarship than the sons of 
‘working men’ and of farmers. Children of farmers averaged 
closer to the population average in scholastic index, while 
children of clerical workers and business men were somewhat 
above average. Children of farm owners had a better average 
than children of farm tenants. 

Married and unmarried females averaged about the same in 
scholastic index, this fact suggesting that the presence of brains, 
contrary to some prejudiced opinion, is not an insuperable 
barrier to marriage, at least in Missouri. Superior women 
showed a slight tendency to marry men in occupations attracting
superior men. The mutual attraction of inferior women to men
in occupations attracting inferior men was also indicated. 

There were evidences of shift in occupation from one generation 
to another. While sixty-two per cent of the fathers had been 
farmers, only about twenty-two per cent of the sons remained in 
that occupation. Thirty-five per cent of the daughters became 
teachers, whereas only .01 per cent of the fathers had been. 
Tenant farmers’ sons tended to seek other occupations, usually 
in business, unskilled, or clerical groups. Farm owners’ sons 
tended to be farmers, teachers, or other professional. 

At the same time certain occupational inheritances were 
apparent. Thirty-four per cent of the sons followed their 
fathers’ occupational classification. Nearly half the business 
men’s sons became business men, and 39.3 per cent of the 
farmers’ sons became farmers. Only 4.5 per cent of the 
daughters followed their fathers’ occupations. The authors 
speak of ‘lower’ and ‘higher’ occupational status; whether as 
judged by the scholastic index average for each occupation or by 
financial remuneration or some other index they do not say. 
They note a shift from ‘lower’ to ‘higher’ occupational levels 
and attribute this to public education and expansion of oppor- 
tunities in the ‘upper’ regions. A larger percentage of women 
married men in their fathers’ occupational classifications than 
could have been expected by chance, and those who married 
‘out-of-bounds’ were not far ‘below’ or ‘above.’ Both the 
authors’ terminology and the last mentioned finding seem to 
reflect the unconscious discrimination and pressure of an undemo- 
cratic society. 

Cities attracted professional and clerical men and women, the 
sons and daughters of professional men, the sons of skilled 
workers, and the daughters of clerical workers and farm owners. 
Male and female teachers seemed attracted to villages. About 
one-third of the professional men and women went outside the 
State, while eighty to ninety per cent of the teachers remained 
within the State. Most mobile were sons of skilled workmen, 
professionals, business, and unskilled fathers. Parental occu- 
pation was a less potent factor apparently than the filial occu- 
pation in range of migration. 

There appeared to be a positive relationship between the 
amount of schooling and the scholastic index, and between the 
tendency to migrate to other communities and scholastic index.
Those longer educated and the better in scholastic attainment
seemed prone to leave home, and few such individuals turned to 
farming. 

Sixty per cent of the group studied stopped school attendance 
at the end of high school. Teachers were better educated as a 
group than other occupational groups. Other professional came 
next, then clerical, business, and, on a par, skilled and unskilled 
workers and farmers. The children of tenant farmers usually 
had only high-school training. Village boys tended to remain 
in school longer than the boys reared on the farm. Again, this 
raises the question of whether the reason was the relative needs 
of the son’s help with the father’s occupation. 

The upshot of this study seems to be that the cities, remote 
communities, and other occupations than agriculture are drain- 
ing the farm of its better educated and supposedly mentally- 
abler offspring; that to offset this tendency is (1) the presence of a 
certain counter-force motivating children to stay in the same 
place, to remain within and to marry within their fathers’ occu- 
pational groups, and (2) the fact that no occupational group and 
no type of community is clearly bereft of talent or utterly 
inferior to another in the scholastic attainments of its members. 
There is, therefore, no immediate and absolute necessity for 
concern over the length of time the children of moronic farmers
can be depended upon to regress toward the mean. On the 
other hand, there is something to be desired in the attraction of 
more able and well-educated men to participation and leadership 
in so essential an occupation as farming is to our national life. 
The authors feel that our educational system must contribute 
more substantially to the needs of rural groups and point out 
that, if education makes for better citizenship, our farmers are 
less well-prepared for taking responsibility in a democracy than 
the better educated occupational groups. 

This study was a tremendous undertaking, in the collection of 
data, the statistical treatment, and the interpretation. Sitting 
at ease in an armchair, however, one is quick to wish that the 
study could have been of larger scope and that a few more ques- 
tions could have been included in the door-to-door investigation. 
Rather than satisfying the reader, the data are constantly raising 
questions, some of which the authors try to answer inferentially. 
There are so many reasons why a student’s scholastic index should 
be low, why the sons of tenant farmers should leave school early,
why a girl marries father’s apprentice, why girls stay nearer
home, why skilled workers go to the cities, and so forth, that the 
reader aches for more detailed information and perhaps a few 
intimate case studies. Furthermore, the occupational groupings 
are so comprehensive as to seem almost meaningless. One 
may be a salesman having to know only a few items of mer- 
chandise or a manager of a puzzling, complicated chain store and 
yet be in the same category. One may be a nurse, a doctor, 
a dentist, a chiropractor, a minister, and still be ‘other profes- 
sional.’ One may be a filing clerk, an adding-machine operator, 
or a private secretary and be clerical. A study of the relation 
of scholastic index to these groupings seems almost useless when 
the index is governed by many unknown factors and a given 
occupational category contains jobs that demand anything from 
little ability to a vast amount. 

An important contribution by this study is its exploration of a 
danger and an asset which we have here in America. We have a
tendency toward immobility in residence, marriage, and occupa- 
tion, which presents the danger of provincialism. We have the 
countertendency of mobility in residence, marriage, and occupa- 
tion, which is a means of breaking down the social barriers which 
ignorance of other places and peoples may otherwise establish. 
In the past fifteen years a depression and a war have taught us 
much about each other and about democracy. We must not 
lose this knowledge. The reviewer agrees with the authors that 
curricular adjustment to the needs of various groups is impera- 
tive. But along with this the reviewer would propose school 
insurance against the snobbery of financial and occupational 
status and of mental ability—an insurance based upon the 
dignity of any constructive work done to the best of one’s ability. 
Only with such insurance can we hope to be an efficiently productive
nation.

CONSTANCE M. MCCULLOUGH

Western Reserve University


CORRECTION 


In the fourth line of Table 2, page 94, of the February issue of
this JOURNAL, in the article entitled “Coöperation versus Individual
Efficiency in Problem-solving,” by Samuel F. Klugman, the
second SD should read 6.24, instead of 66.24 as printed. Ed.














